How to grab specific element to value

KraZy
I have this string value:
Code: Select all
<a href="/teams/italy/fc-internazionale-milano/1244/" title="Inter">Inter</a>
I want to strip everything else and keep only the link text, Inter. This is just a sample value: the text is not always the same (it could be a player name such as P. Osvaldo), so I need a general algorithm.
I found this regex on the internet:
Code: Select all
(?<=>)[^>]*(?=</a>$)
but it doesn't work with my code:
Code: Select all
 .Giocatore = Regex.Match(Content, "<td class=""large player-link""\s*>(.+?)</td>").Groups(1).ToString
This variable holds a value grabbed from a web page. The content grabbed is:
Code: Select all
<a href="/teams/italy/fc-internazionale-milano/1244/" title="Inter">Inter</a>
so I want to process this whole string and keep only the text inside the link.
I'm in the empire business.
mandai

If you parse the XML you could use this code to get the title attribute:
Code: Select all
    ' Requires "Imports System.IO" and "Imports System.Xml" at the top of the file.
    Private Sub btnTest_Click(sender As System.Object, e As System.EventArgs) Handles btnTest.Click

        ' Doubled quotes ("") are how VB.NET escapes a quote inside a string literal.
        Dim input As String = "<a href=""/teams/italy/fc-internazionale-milano/1244/"" title=""Inter"">Inter</a>"
        Dim inputReader As StringReader = New StringReader(input)

        Dim inputXml As XmlTextReader = New XmlTextReader(inputReader)
        inputXml.Read() ' Moves the reader onto the <a> element

        Dim title As String = inputXml.GetAttribute("title")

        MsgBox(title)

    End Sub
visualtech

Hey! I can propose a solution, but it can be a little cumbersome; rest assured, though, you'll get the results! This solution is based on really simple logic without REGEX.
Code: Select all

// C# syntax; the same logic ports directly to VB.NET.
// Considering you have the string stored in this variable
string s = "<a href=\"/teams/italy/fc-internazionale-milano/1244/\" title=\"Inter\">Inter</a>";

// Now, we'll iterate over each char till we get the ">" that follows a quote,
// which marks the end of the opening tag.

int offset = 0; // Index of the first character after the opening tag

for (int i = 1; i < s.Length; i++) { // Start at 1 so s[i - 1] is always valid
  if (s[i] == '>' && s[i - 1] == '"') {
    offset = i + 1;
    break;
  }
}

s = s.Substring(offset); // Now we have removed the anchor's opening tag, the closing one remains constant.

s = s.Replace("</a>", "");

// Now, "s" should contain the value.
MessageBox.Show(s);

Damn this internet! #mandai beat me! :P
SumCode

If you want to use regex this will work: >(\w+)<\/a>

And just capture the first group
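
For reference, here is a minimal VB.NET sketch of capturing that first group. Note that `\w+` only matches letters, digits and underscores, so a value with spaces or dots (a player name, for example) would not match. The sketch therefore swaps in `[^<]+`, which is my own assumption and not part of the pattern above; the module and function names (LinkTextDemo, GetLinkText) are made up for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LinkTextDemo
    ' Returns the text between the opening tag and </a>,
    ' or an empty string if nothing matches.
    Function GetLinkText(html As String) As String
        ' Assumption: [^<]+ replaces \w+ so text with spaces or dots also matches.
        Dim m As Match = Regex.Match(html, ">([^<]+)</a>")
        Return If(m.Success, m.Groups(1).Value, String.Empty)
    End Function

    Sub Main()
        Dim sample As String = "<a href=""/teams/italy/fc-internazionale-milano/1244/"" title=""Inter"">Inter</a>"
        Console.WriteLine(GetLinkText(sample)) ' prints Inter
    End Sub
End Module
```

Groups(0) is always the whole match, which is why the captured text sits in Groups(1).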
KraZy

Thanks to all, good solutions.
AnoPem

Code: Select all
<(.*)>(.*)<(\/.*)>
This will split the string into groups so that the second group will be Inter.
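
A rough sketch of how the pattern above could be used from VB.NET, reading the second group. The names GroupSplitDemo and GetInnerText are hypothetical. Keep in mind that `.*` is greedy, so this is only safe on a string holding a single tag pair; on a longer page it can span several tags.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GroupSplitDemo
    ' Returns what the second group captured (the text between the tags),
    ' or an empty string if the pattern does not match.
    Function GetInnerText(html As String) As String
        Dim m As Match = Regex.Match(html, "<(.*)>(.*)<(\/.*)>")
        Return If(m.Success, m.Groups(2).Value, String.Empty)
    End Function

    Sub Main()
        Dim sample As String = "<a href=""/teams/italy/fc-internazionale-milano/1244/"" title=""Inter"">Inter</a>"
        Console.WriteLine(GetInnerText(sample)) ' prints Inter
    End Sub
End Module
```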
KraZy

One last question: I've fixed the problem with one of your solutions, but now I want to understand why this regex doesn't work with some links:
Code: Select all
 Dim Table_Rgx As MatchCollection = Regex.Matches(Code, "<tr class=""(odd|even)"" data-people_id=""\d+"" data-team_id=""(\d+)"">(.+?)</tr>", RegexOptions.Singleline)
I populate the "Code" variable from a link. In particular, for this link:

http://it.soccerway.com/a/block_competi ... %22%3A1%7D

it works fine and I have no problems, but for this link:

http://it.soccerway.com/a/block_competi ... %22%3A2%7D

it seems that the Table_Rgx variable isn't populated with any matches.
Later on I use this variable in a loop to populate other variables.

What's wrong? I don't know.
comathi

What value are you trying to isolate exactly? It seems to me like the code you just posted won't isolate something along the lines of the first example you posted in the original post :?
KraZy

If you check the links that I posted previously, you will find rows like this:
Code: Select all
<tr class=\"odd\" data-people_id=\"122\" data-team_id=\"1242\">
If you try the regex with the first link you can see that it works, but with the second link it doesn't match anything, so the Table_Rgx variable isn't populated.
comathi

In the second link you provided, none of the <tr> elements have a data-team_id attribute, so the REGEX won't match them. However, if you want to make that attribute optional and match all <tr> elements that at least have the data-people_id attribute, you can use this REGEX string:
Code: Select all
<tr class=\\"(even|odd)\\" data-people_id=\\"\d+\\"( data-team_id=\\"\d+\\")?>
I tried this in Regexr, and for both URLs, it matched 15 instances :)
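
In case it helps, here is a sketch of the same optional-attribute idea in VB.NET, using Regex.Matches as in the earlier post. The two sample rows are made up for illustration (they are not copied from the site), and the sketch assumes the quotes in the downloaded text are plain `"`. If the response really contains `\"`, each quote in the VB string would need to be written as `\\""`. The names OptionalAttributeDemo and GetTeamIds are hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module OptionalAttributeDemo
    ' Returns one entry per matched <tr>: the team id, or "(none)" when the
    ' optional data-team_id attribute is missing.
    Function GetTeamIds(code As String) As List(Of String)
        ' (?: ... )? makes the whole data-team_id attribute optional.
        Dim pattern As String =
            "<tr class=""(odd|even)"" data-people_id=""\d+""(?: data-team_id=""(\d+)"")?>(.+?)</tr>"

        Dim result As New List(Of String)
        For Each m As Match In Regex.Matches(code, pattern, RegexOptions.Singleline)
            ' Groups(2) is only successful when data-team_id was present.
            result.Add(If(m.Groups(2).Success, m.Groups(2).Value, "(none)"))
        Next
        Return result
    End Function

    Sub Main()
        ' Hypothetical rows: the second one has no data-team_id.
        Dim code As String =
            "<tr class=""odd"" data-people_id=""122"" data-team_id=""1242""><td>A</td></tr>" &
            "<tr class=""even"" data-people_id=""123""><td>B</td></tr>"

        For Each teamId As String In GetTeamIds(code)
            Console.WriteLine(teamId)
        Next
    End Sub
End Module
```

Checking Groups(2).Success before reading the value is what lets the later loop tell rows with a team id apart from rows without one.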
