• How to grab a specific element's value

  • If you need help with a project or need to know how to do something specific in VB.NET then please ask your questions in here.
Forum rules: Please LOCK your topics once you have found the solution to your question so we know you no longer require help with your query.
 #82712  by KraZy
 Fri Jan 02, 2015 7:38 pm
I have this string value:
Code: Select all
<a href="/teams/italy/fc-internazionale-milano/1244/" title="Inter">Inter</a>
I want to strip everything else and keep only Inter. This is just a sample value; the text (P. Osvaldo, for example) is not always the same, so I am looking for a general algorithm.
I found this regex on the internet:
Code: Select all
(?<=>)[^>]*(?=</a>$)
but it doesn't work on the value produced by this line:
Code: Select all
 .Giocatore = Regex.Match(Content, "<td class=""large player-link""\s*>(.+?)</td>").Groups(1).ToString
This variable holds a value grabbed from a web page; the grabbed content is:
Code: Select all
<a href="/teams/italy/fc-internazionale-milano/1244/" title="Inter">Inter</a>
so I want to process every string that follows this pattern.
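One possible reason the lookbehind pattern above fails is its `$` anchor, which requires `</a>` to sit at the very end of the input. A capture group avoids that constraint. The following is only a minimal sketch: `GetAnchorText` and `AnchorTextDemo` are hypothetical names, and it assumes the input contains a simple, non-nested `<a>` element.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AnchorTextDemo
    ' Hypothetical helper: returns the text between <a ...> and </a>,
    ' or an empty string when no anchor is found.
    Function GetAnchorText(html As String) As String
        ' <a\b[^>]*>  the opening tag with any attributes
        ' (.*?)       the inner text, captured lazily as group 1
        ' </a>        the closing tag
        Dim m As Match = Regex.Match(html, "<a\b[^>]*>(.*?)</a>",
                                     RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        If m.Success Then
            Return m.Groups(1).Value
        End If
        Return String.Empty
    End Function

    Sub Main()
        Dim sample As String = "<a href=""/teams/italy/fc-internazionale-milano/1244/"" title=""Inter"">Inter</a>"
        Console.WriteLine(GetAnchorText(sample)) ' prints Inter
    End Sub
End Module
```

Because the pattern is not anchored to the end of the string, it should also work when the anchor is surrounded by other markup such as a `<td>` cell.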
 #82713  by mandai
 Fri Jan 02, 2015 8:39 pm
If you parse the XML you could use this code to get the title attribute:
Code: Select all
    ' Requires Imports System.IO and Imports System.Xml at the top of the file.
    Private Sub btnTest_Click(sender As System.Object, e As System.EventArgs) Handles btnTest.Click

        Dim input As String = "<a href=""/teams/italy/fc-internazionale-milano/1244/"" title=""Inter"">Inter</a>"
        Dim inputReader As StringReader = New StringReader(input)

        ' Read() advances the reader to the first node, the <a> element.
        Dim inputXml As XmlTextReader = New XmlTextReader(inputReader)
        inputXml.Read()

        ' Attributes of the current element can be fetched by name.
        Dim title As String = inputXml.GetAttribute("title")

        MsgBox(title)

    End Sub
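If the visible text is needed rather than the title attribute, the same XML-parsing idea can be sketched with LINQ to XML. This is a swapped-in technique, not the code from the post above, and it assumes the fragment is well-formed XML, which real-world HTML often is not. `XmlInnerTextDemo` is a hypothetical name.

```vbnet
Imports System
Imports System.Xml.Linq

Module XmlInnerTextDemo
    Sub Main()
        Dim input As String = "<a href=""/teams/italy/fc-internazionale-milano/1244/"" title=""Inter"">Inter</a>"

        ' XElement.Parse throws an XmlException when the fragment is not well-formed XML.
        Dim anchor As XElement = XElement.Parse(input)

        Console.WriteLine(anchor.Value)                    ' the inner text of the element
        Console.WriteLine(anchor.Attribute("title").Value) ' the title attribute
    End Sub
End Module
```

For this sample both lines print Inter, since the title and the inner text happen to be the same.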
 #82714  by visualtech
 Fri Jan 02, 2015 9:01 pm
Hey! I can propose a solution, but it can be a little cumbersome; rest assured, though, you'll get the results! This solution is based on really simple logic, without REGEX.
Code: Select all

// Assume the string is stored in this variable (C# syntax)
string s = "<a href=\"/teams/italy/fc-internazionale-milano/1244/\" title=\"Inter\">Inter</a>";

// Iterate over each char until we reach the ">" that closes the opening tag.

int offset = 0; // Index of the first char after that ">"

for (int i = 1; i < s.Length; i++) {
  if (s[i] == '>' && s[i - 1] == '"') {
    offset = i + 1; // Remember the position and stop at the first hit
    break;
  }
}

s = s.Substring(offset); // The opening tag is gone; only the constant closing tag remains.

s = s.Replace("</a>", "");

// Now "s" contains the value.
MessageBox.Show(s);

Damn this internet! #mandai beat me! :P
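Since this is a VB.NET board, the same no-regex idea can be sketched in VB.NET with `IndexOf` instead of a manual loop. `StripAnchor` and `StripTagDemo` are hypothetical names, and the sketch assumes the first `>` in the string closes the opening tag, meaning no attribute value contains `>`.

```vbnet
Imports System

Module StripTagDemo
    ' Hypothetical helper: keeps the text between the first ">" and the following "</a>".
    ' Returns the input unchanged when either delimiter is missing.
    Function StripAnchor(s As String) As String
        Dim openEnd As Integer = s.IndexOf(">"c)
        If openEnd < 0 Then Return s

        Dim closeStart As Integer = s.IndexOf("</a>", openEnd + 1, StringComparison.OrdinalIgnoreCase)
        If closeStart < 0 Then Return s

        ' Take everything strictly between the two delimiters.
        Return s.Substring(openEnd + 1, closeStart - openEnd - 1)
    End Function

    Sub Main()
        Dim s As String = "<a href=""/teams/italy/fc-internazionale-milano/1244/"" title=""Inter"">Inter</a>"
        Console.WriteLine(StripAnchor(s)) ' prints Inter
    End Sub
End Module
```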
 #82718  by KraZy
 Sat Jan 03, 2015 11:59 am
One last question: I've fixed the problem with one of your solutions, but now I want to understand why this regex doesn't work with some links:
Code: Select all
 Dim Table_Rgx As MatchCollection = Regex.Matches(Code, "<tr class=""(odd|even)"" data-people_id=""\d+"" data-team_id=""(\d+)"">(.+?)</tr>", RegexOptions.Singleline)
I populate the "Code" variable with the content of a link. For this link in particular:

http://it.soccerway.com/a/block_competi ... %22%3A1%7D

it works fine and I have no problems, but for this link:

http://it.soccerway.com/a/block_competi ... %22%3A2%7D

it seems that the Table_Rgx variable is not populated with any matches.
Later on I use this variable in a loop to populate other variables.

What's wrong? I don't know.
 #82719  by comathi
 Sat Jan 03, 2015 1:25 pm
What value are you trying to isolate exactly? It seems to me like the code you just posted won't isolate something along the lines of the first example you posted in the original post :?
 #82720  by KraZy
 Sat Jan 03, 2015 2:01 pm
If you check the links that I previously posted you find this class:
Code: Select all
<tr class=\"odd\" data-people_id=\"122\" data-team_id=\"1242\">
If you try the regex with the first link you can see that it works, but with the second link it doesn't, so the table.. variable is not populated.
 #82721  by comathi
 Sat Jan 03, 2015 2:59 pm
In the second link you provided, none of the <tr> elements have a data-team_id attribute, so the REGEX won't match them. However, if you want to make that attribute optional and match all <tr> elements that at least have the data-people_id attribute, you can use this REGEX string:
Code: Select all
<tr class=\\"(even|odd)\\" data-people_id=\\"\d+\\"( data-team_id=\\"\d+\\")?>
I tried this in Regexr, and for both URLs, it matched 15 instances :)
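For reference, here is how that pattern might look inside a VB.NET string literal. This is only a sketch: the two sample rows are made up (the second row and its id are invented for illustration), and it assumes the raw response still contains the escaped quotes `\"` shown earlier in the thread.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module OptionalAttributeDemo
    Sub Main()
        ' Made-up sample rows; the second one has no data-team_id attribute.
        Dim code As String =
            "<tr class=\""odd\"" data-people_id=\""122\"" data-team_id=\""1242\"">" &
            "<tr class=\""even\"" data-people_id=\""123\"">"

        ' In a VB literal, \\"" yields the regex \\" which matches a backslash followed by a quote.
        ' The trailing ? makes the whole data-team_id group optional.
        Dim pattern As String =
            "<tr class=\\""(even|odd)\\"" data-people_id=\\""\d+\\""( data-team_id=\\""\d+\\"")?>"

        Dim rows As MatchCollection = Regex.Matches(code, pattern)
        Console.WriteLine(rows.Count) ' prints 2

        For Each row As Match In rows
            ' Groups(2).Success is False when the optional attribute is absent.
            Console.WriteLine(row.Groups(1).Value & " has team id: " & row.Groups(2).Success.ToString())
        Next
    End Sub
End Module
```

If the text has already been unescaped so that it contains plain quotes, each `\\""` in the pattern would need to become `""` instead.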
