How to grab specific element to value

Posted: Fri Jan 02, 2015 7:38 pm
by KraZy
I've this string value:
Code: Select all
<a href="/teams/italy/fc-internazionale-milano/1244/" title="Inter">Inter</a>
I want to strip everything else and keep only the link text (Inter). This is just a sample value; the text (for example P. Osvaldo) is not always the same, so I need a general approach.
I found this regex on the internet:
Code: Select all
(?<=>)[^>]*(?=</a>$)
but it doesn't work
Code: Select all
 .Giocatore = Regex.Match(Content, "<td class=""large player-link""\s*>(.+?)</td>").Groups(1).ToString
This variable holds a value grabbed from a web page. The grabbed content is
Code: Select all
<a href="/teams/italy/fc-internazionale-milano/1244/" title="Inter">Inter</a>
so I want to process the whole pattern and extract just the text.

Re: How to grab specific element to value

Posted: Fri Jan 02, 2015 8:39 pm
by mandai
If you parse the XML you could use this code to get the title attribute:
Code: Select all
    ' Requires Imports System.IO and Imports System.Xml at the top of the file
    Private Sub btnTest_Click(sender As System.Object, e As System.EventArgs) Handles btnTest.Click

        Dim input As String = "<a href=""/teams/italy/fc-internazionale-milano/1244/"" title=""Inter"">Inter</a>"
        Dim inputReader As StringReader = New StringReader(input)

        Dim inputXml As XmlTextReader = New XmlTextReader(inputReader)
        inputXml.Read()

        Dim title As String = inputXml.GetAttribute("title")

        MsgBox(title)

    End Sub
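
If the link text is needed rather than the title attribute, a minimal sketch along the same lines could use XElement from System.Xml.Linq. It assumes the fragment is well-formed XML (real-world HTML often is not), and the module name is only there for illustration.

```vbnet
Imports System
Imports System.Xml.Linq

Module AnchorTextDemo ' hypothetical name, for illustration only
    Sub Main()
        ' Sample value from the first post
        Dim input As String = "<a href=""/teams/italy/fc-internazionale-milano/1244/"" title=""Inter"">Inter</a>"

        ' Parse the fragment as a single XML element
        Dim anchor As XElement = XElement.Parse(input)

        ' .Value is the concatenated text content of the element
        Console.WriteLine(anchor.Value)

        ' The title attribute is still available if needed
        Console.WriteLine(anchor.Attribute("title").Value)
    End Sub
End Module
```

Both lines print Inter for this sample. XElement.Parse throws an XmlException on malformed input, so wrap it in a Try block if the grabbed content is not guaranteed to be clean.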

Re: How to grab specific element to value

Posted: Fri Jan 02, 2015 9:01 pm
by visualtech
Hey! I can propose a solution, but it can be a little cumbersome; rest assured, though, you'll get the results! This solution is based on really simple logic, without regex.
Code: Select all

// Assuming you have the string stored in this variable
string s = "<a href=\"/teams/italy/fc-internazionale-milano/1244/\" title=\"Inter\">Inter</a>";

// Iterate over each char until we reach the ">" that follows a closing quote,
// which marks the end of the opening tag.

int offset = 0; // Index of the first char after the opening tag

for (int i = 1; i < s.Length; i++) {
  if (s[i] == '>' && s[i - 1] == '"') {
    offset = i + 1;
    break;
  }
}

s = s.Substring(offset); // The anchor's opening tag is removed; the closing one is always the same.

s = s.Replace("</a>", "");

// Now, "s" should contain the value.
MessageBox.Show(s);

Damn this internet! #mandai beat me! :P
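
For anyone who wants the same no-regex idea in VB.NET, here is a rough sketch using IndexOf and LastIndexOf instead of a manual loop. The function and module names are made up for illustration, and it assumes the input is a single anchor with no ">" inside its attribute values.

```vbnet
Imports System

Module SubstringDemo ' hypothetical name, for illustration only
    ' Returns the text between the end of the opening tag and the closing </a>,
    ' or an empty string if the input does not look like a single anchor.
    Function AnchorText(s As String) As String
        Dim startIdx As Integer = s.IndexOf(">"c) ' end of the opening tag
        Dim endIdx As Integer = s.LastIndexOf("</a>", StringComparison.OrdinalIgnoreCase)
        If startIdx < 0 OrElse endIdx <= startIdx Then Return String.Empty
        Return s.Substring(startIdx + 1, endIdx - startIdx - 1)
    End Function

    Sub Main()
        Dim s As String = "<a href=""/teams/italy/fc-internazionale-milano/1244/"" title=""Inter"">Inter</a>"
        Console.WriteLine(AnchorText(s)) ' prints Inter
    End Sub
End Module
```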

Re: How to grab specific element to value

Posted: Sat Jan 03, 2015 1:15 am
by SumCode
If you want to use regex this will work: >(\w+)<\/a>

And just capture the first group
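
A short sketch of how the first group could be read in VB.NET. One caveat worth checking: \w+ matches only letters, digits and underscores, so a name such as P. Osvaldo would not match; the second pattern below, >([^<]+)</a>, is one possible alternative and not part of the suggestion above.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexGroupDemo ' hypothetical name, for illustration only
    Sub Main()
        Dim inter As String = "<a href=""/teams/italy/fc-internazionale-milano/1244/"" title=""Inter"">Inter</a>"
        ' Made-up second sample whose text contains a dot and a space
        Dim osvaldo As String = "<a href=""/players/"" title=""P. Osvaldo"">P. Osvaldo</a>"

        ' Pattern from the post: group 1 is the link text
        Dim m As Match = Regex.Match(inter, ">(\w+)<\/a>")
        If m.Success Then Console.WriteLine(m.Groups(1).Value) ' prints Inter

        ' \w+ stops at the dot, so this one does not match
        Console.WriteLine(Regex.IsMatch(osvaldo, ">(\w+)<\/a>")) ' prints False

        ' Alternative: accept any run of characters other than "<"
        Dim m2 As Match = Regex.Match(osvaldo, ">([^<]+)</a>")
        If m2.Success Then Console.WriteLine(m2.Groups(1).Value) ' prints P. Osvaldo
    End Sub
End Module
```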

Re: How to grab specific element to value

Posted: Sat Jan 03, 2015 9:08 am
by KraZy
Thanks to all, good solutions.

Re: How to grab specific element to value

Posted: Sat Jan 03, 2015 11:41 am
by AnoPem
Code: Select all
<(.*)>(.*)<(\/.*)>
This will split the string so that the second group is Inter.
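
A small sketch of reading the second group in VB.NET. Note that .* is greedy, so if the input happens to contain more than one link the first group swallows everything up to the last opening tag; the lazy variant with .*? shown below is an assumption about what is wanted in that case. The two-link input is made up for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GreedyLazyDemo ' hypothetical name, for illustration only
    Sub Main()
        Dim one As String = "<a href=""/teams/italy/fc-internazionale-milano/1244/"" title=""Inter"">Inter</a>"
        ' Made-up input with two links in a row
        Dim two As String = "<a href=""/x/"">Inter</a><a href=""/y/"">Milan</a>"

        ' Single link: group 2 is the link text
        Console.WriteLine(Regex.Match(one, "<(.*)>(.*)<(\/.*)>").Groups(2).Value) ' prints Inter

        ' Two links, greedy: only one match, and group 2 is the last link's text
        Console.WriteLine(Regex.Match(two, "<(.*)>(.*)<(\/.*)>").Groups(2).Value) ' prints Milan

        ' Two links, lazy: one match per link
        For Each m As Match In Regex.Matches(two, "<(.*?)>(.*?)<(\/.*?)>")
            Console.WriteLine(m.Groups(2).Value) ' prints Inter, then Milan
        Next
    End Sub
End Module
```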

Re: How to grab specific element to value

Posted: Sat Jan 03, 2015 11:59 am
by KraZy
One last question: I've fixed the problem with one of your solutions, but now I want to understand why this regex doesn't work with some links:
Code: Select all
 Dim Table_Rgx As MatchCollection = Regex.Matches(Code, "<tr class=""(odd|even)"" data-people_id=""\d+"" data-team_id=""(\d+)"">(.+?)</tr>", RegexOptions.Singleline)
I fill the "Code" variable with the content of a link. In particular, for this link:

http://it.soccerway.com/a/block_competi ... %22%3A1%7D

it works fine and I have no problems, but for this link:

http://it.soccerway.com/a/block_competi ... %22%3A2%7D

it seems that the Table_Rgx variable is not populated with any matches.
Later on I use this variable in a loop to populate other variables.

What's wrong? I don't know.

Re: How to grab specific element to value

Posted: Sat Jan 03, 2015 1:25 pm
by comathi
What value are you trying to isolate exactly? It seems to me like the code you just posted won't isolate something along the lines of the first example you posted in the original post :?

Re: How to grab specific element to value

Posted: Sat Jan 03, 2015 2:01 pm
by KraZy
If you check the links that I posted previously, you'll find this row:
Code: Select all
<tr class=\"odd\" data-people_id=\"122\" data-team_id=\"1242\">
If you try the regex with the first link you can see that it works, but with the second link it doesn't, so the table.. variable isn't populated.

Re: How to grab specific element to value

Posted: Sat Jan 03, 2015 2:59 pm
by comathi
In the second link you provided, none of the <tr> elements have a data-team_id attribute, so the regex won't match them. However, if you want to make that attribute optional and match all <tr> elements that have at least the data-people_id attribute, you can use this regex string:
Code: Select all
<tr class=\\"(even|odd)\\" data-people_id=\\"\d+\\"( data-team_id=\\"\d+\\")?>
I tried this in Regexr, and for both URLs, it matched 15 instances :)
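
As a rough sketch, this is how the pattern might look inside a VB.NET string literal, where every quote is doubled. It assumes the Code variable holds the raw response with backslash-escaped quotes, as in the row quoted earlier; if the quotes have already been unescaped, the \\ parts of the pattern should be dropped. The second row in the sample is made up for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module OptionalAttributeDemo ' hypothetical name, for illustration only
    Sub Main()
        ' Two rows as they would appear in the escaped response.
        ' The second row is invented and has no data-team_id.
        Dim code As String =
            "<tr class=\""odd\"" data-people_id=\""122\"" data-team_id=\""1242\"">" &
            "<tr class=\""even\"" data-people_id=\""123\"">"

        ' Same pattern as above; "" is how VB.NET writes a quote inside a literal
        Dim pattern As String =
            "<tr class=\\""(even|odd)\\"" data-people_id=\\""\d+\\""( data-team_id=\\""\d+\\"")?>"

        Dim rows As MatchCollection = Regex.Matches(code, pattern)
        Console.WriteLine(rows.Count) ' prints 2

        For Each row As Match In rows
            ' Group 2 did not participate when the optional data-team_id part is missing
            Console.WriteLine(row.Groups(1).Value & " / has team id: " & row.Groups(2).Success.ToString())
        Next
    End Sub
End Module
```

Checking Groups(2).Success before reading the team id avoids treating an empty string as a valid value further down in the loop.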
