
Using Regex to find elements?

Posted: Wed Oct 15, 2014 3:18 am
by SumCode
When a user enters some text into a textbox, I want to find matches for the pattern '[A-Z][a-z]?\d+'.
If the user enters 'Hg2H1', it splits into 'Hg2' and 'H1', which is what I want. However, if the user enters 'Zn4Pb15H2O', the program splits it into 'Zn4', 'Pb15' and 'H2O', which isn't desirable. I want it to split into 'Zn4', 'Pb1' and '5H2O'. Any help is appreciated, thanks!
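
For anyone following along, here is a minimal console sketch of why the pattern behaves this way: `\d+` is greedy, so it consumes every consecutive digit after the symbol and 'Pb15' can never come out as 'Pb1'. The class and method names (`FormulaSplitDemo`, `Split`) are made up for illustration and are not part of the original program.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class FormulaSplitDemo
{
    // Returns every substring that looks like "element symbol + digits".
    public static string[] Split(string input)
    {
        return Regex.Matches(input, @"[A-Z][a-z]?\d+")
                    .Cast<Match>()          // MatchCollection is non-generic on older frameworks
                    .Select(m => m.Value)
                    .ToArray();
    }
}
```

Calling `FormulaSplitDemo.Split("Zn4Pb15H2O")` gives 'Zn4', 'Pb15' and 'H2'. Note that on its own this pattern leaves the trailing 'O' unmatched, because no digit follows it.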

Re: Using Regex to find elements?

Posted: Wed Oct 15, 2014 4:41 am
by CodenStuff

Re: Using Regex to find elements?

Posted: Wed Oct 15, 2014 11:49 am
by comathi
The problem is that you can't know whether the user meant Pb15 or Pb1 + 5H2O... as these are two input possibilities.

Edit: Also, I may be wrong, but don't coefficients exist only when typing out a chemical equation? In the case of a chemical formula, I don't think you'll encounter coefficients, just subscripts.

Re: Using Regex to find elements?

Posted: Wed Oct 15, 2014 11:49 pm
by SumCode
comathi wrote:
The problem is that you can't know whether the user meant Pb15 or Pb1 + 5H2O... as these are two input possibilities.

Edit: Also, I may be wrong, but don't coefficients exist only when typing out a chemical equation? In the case of a chemical formula, I don't think you'll encounter coefficients, just subscripts.
Yeah, so I made it so the user has to enter '*[amt of water]H2O', wrote some custom regex patterns, and came up with this code:
Code:
    // Needs: using System; using System.Linq; using System.Text.RegularExpressions;
    // (System.Linq supplies Contains(char) on older .NET Framework versions.)
    // Pull out the optional "*<n>H2O" suffix so its H and O are not counted as elements.
    String useWaters = equationTB.Text.Contains("H2O") ? Regex.Match(equationTB.Text, @"\*\d+(H2O)").Value : "";
    String equation = useWaters != "" ? equationTB.Text.Replace(useWaters, "") : equationTB.Text;
    watersAmount.Text = useWaters != "" ? Regex.Match(useWaters, @"\d+").Value : "0";
    MatchCollection allMatches = Regex.Matches(equation, @"\(?[A-Z][a-z]?\d+\)?\d*");
    for (int i = 0; i < allMatches.Count; i++)
    {
        String amount;
        String value = allMatches[i].Value;
        String symbol = Regex.Match(value, "[A-Z][a-z]?").Value;
        if (value.Contains('('))
        {
            // Opening element of a group: multiply its subscript by the number after ')' in the next match.
            amount = Convert.ToString(Convert.ToInt32(Regex.Match(value, @"\d+").Value) * Convert.ToInt32(allMatches[i].NextMatch().Value.Split(')')[1]));
        }
        else if (value.Contains(')'))
        {
            // Closing element of a group: multiply its subscript by the number after ')'.
            amount = Convert.ToString(Convert.ToInt32(value.Split(')')[1]) * Convert.ToInt32(value.Split(')')[0].Replace(symbol, "")));
        }
        else
        {
            amount = Regex.Match(value, @"\d+").Value;
        }
        switch (i)
        {
            case 0:
                firstElement.SelectedIndex = allElements.FindIndex(k => k.elementSymbol == symbol);
                firstAmount.Text = amount;
                break;
            case 1:
                secondElement.SelectedIndex = allElements.FindIndex(k => k.elementSymbol == symbol);
                secondAmount.Text = amount;
                break;
            case 2:
                thirdElement.SelectedIndex = allElements.FindIndex(k => k.elementSymbol == symbol);
                thirdAmount.Text = amount;
                break;
            case 3:
                fourthElement.SelectedIndex = allElements.FindIndex(k => k.elementSymbol == symbol);
                fourthAmount.Text = amount;
                break;
            case 4:
                fifthElement.SelectedIndex = allElements.FindIndex(k => k.elementSymbol == symbol);
                fifthAmount.Text = amount;
                break;
        }
    }
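
The same parsing idea can be tried without the form controls. The sketch below is a rough console-friendly version under a few assumptions: the names `FormulaParserSketch`, `Parse`, `CountWaters` and `GroupMultiplier` are hypothetical, a missing number after ')' is treated as 1, and, like the code above, only the first and last element of a parenthesised group pick up the group multiplier.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-alone version of the parsing step; no UI involved.
public static class FormulaParserSketch
{
    // Returns n from a "*<n>H2O" suffix, or 0 if there is no such suffix.
    public static int CountWaters(string input)
    {
        Match m = Regex.Match(input, @"\*(\d+)H2O");
        return m.Success ? int.Parse(m.Groups[1].Value) : 0;
    }

    // Returns (symbol, amount) pairs in the order they appear in the input.
    public static List<KeyValuePair<string, int>> Parse(string input)
    {
        // Drop the hydrate suffix so its H and O are not counted as elements.
        string formula = Regex.Replace(input, @"\*\d+H2O", "");
        var result = new List<KeyValuePair<string, int>>();
        MatchCollection matches = Regex.Matches(formula, @"\(?[A-Z][a-z]?\d+\)?\d*");
        for (int i = 0; i < matches.Count; i++)
        {
            string value = matches[i].Value;
            string symbol = Regex.Match(value, "[A-Z][a-z]?").Value;
            int amount = int.Parse(Regex.Match(value, @"\d+").Value);
            if (value.Contains(")"))
            {
                // Closing element: the multiplier follows its own ')'.
                amount *= GroupMultiplier(value);
            }
            else if (value.StartsWith("(") && i + 1 < matches.Count)
            {
                // Opening element: borrow the multiplier from the next match.
                amount *= GroupMultiplier(matches[i + 1].Value);
            }
            result.Add(new KeyValuePair<string, int>(symbol, amount));
        }
        return result;
    }

    // Reads the digits after ')'; no ')' or no digits counts as 1 (an assumption).
    private static int GroupMultiplier(string value)
    {
        int close = value.IndexOf(')');
        if (close < 0 || close == value.Length - 1) return 1;
        return int.Parse(value.Substring(close + 1));
    }
}
```

With this sketch, `Parse("Pb2(Hg1O2)1He1*2H2O")` returns Pb 2, Hg 1, O 2 and He 1, and `CountWaters` returns 2 for the same input.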
So if I were to enter an equation like 'Pb2(Hg1O2)1He1*2H2O', the program would see it as:
Pb - 2
Hg - 1
O - 2
H2O - 2
And the end result is: [screenshot]

Also, just in case anyone is taking chemistry like me, here's the source