Thread: Need help with Regex

  1. #1 Need help with Regex 
    Ed
    AKA Edvinas
    Join Date
    Jun 2009
    Age
    28
    Posts
    4,504
    Thanks given
    523
    Thanks received
    512
    Rep Power
    2659
    Well, it's not so much the regex itself that I have problems with, but the way the matched text gets output.

    /inb4 Michael comes, this is in C# :coolface:


    Here's a simple program I quickly wrote as an example of what I have in mind.

    Code:
    using System;
    using System.Collections.Generic;
    using System.ComponentModel;
    using System.Data;
    using System.Drawing;
    using System.Linq;
    using System.Text;
    using System.Windows.Forms;
    using System.Text.RegularExpressions;

    namespace WindowsFormsApplication19
    {
        public partial class Form1 : Form
        {
            public Form1()
            {
                InitializeComponent();
            }

            private void Form1_Load(object sender, EventArgs e)
            {
                // Three nearly identical lines; only the text inside the <div> differs.
                string source = "<li class='colorbutton'><div style='background-color:DarkSlateBlue'>Hello</div></li>" + Environment.NewLine +
                    "<li class='colorbutton'><div style='background-color:DarkSlateBlue'>Hello1</div></li>" + Environment.NewLine +
                    "<li class='colorbutton'><div style='background-color:DarkSlateBlue'>Hello2</div></li>";

                // The lazy group (.*?) captures the text up to the first </div></li>.
                Regex r1 = new Regex("<li class='colorbutton'><div style='background-color:DarkSlateBlue'>(.*?)</div></li>", RegexOptions.Singleline);

                // Walk every match and append one [Start]/[End] block per match.
                for (Match m1 = r1.Match(source); m1.Success; m1 = m1.NextMatch())
                {
                    textBox1.Text += "[Start]" + Environment.NewLine +
                        "The string is: " + m1.Groups[1] + Environment.NewLine +
                        "[End]" + Environment.NewLine;
                }
            }
        }
    }
    Basically, in the example I have a string called "source" which consists of 3 lines that are almost identical. Notice how one has the word "Hello", another "Hello1" and the last one "Hello2":

    Code:
    string source = "<li class='colorbutton'><div style='background-color:DarkSlateBlue'>Hello</div></li>" + Environment.NewLine +
        "<li class='colorbutton'><div style='background-color:DarkSlateBlue'>Hello1</div></li>" + Environment.NewLine +
        "<li class='colorbutton'><div style='background-color:DarkSlateBlue'>Hello2</div></li>";
    Notice that everything there is the same except for those Hello parts.
    My sample program finds those Hello parts using regex like this:
    Code:
    Regex r1 = new Regex("<li class='colorbutton'><div style='background-color:DarkSlateBlue'>(.*?)</div></li>", RegexOptions.Singleline);
    and then puts it in a textbox with this:
    Code:
    textBox1.Text += "[Start]" + Environment.NewLine +
    "The string is: " + m1.Groups[1] + Environment.NewLine +
    "[End]" + Environment.NewLine;
    I'm not complaining, it works just fine. After running the program, the output is:

    Code:
    [Start]
    The string is: Hello
    [End]
    [Start]
    The string is: Hello1
    [End]
    [Start]
    The string is: Hello2
    [End]
    And that is basically where I need help...
    I don't want it to output like that. The pattern is the same, only the "Hello" part is different, so I want it to simply output like this:

    Code:
    [Start]
    The string is: Hello, Hello1, Hello2
    [End]
    But whatever I try, I just can't achieve it... I know it's probably basic stuff, but it just doesn't work for me.
    In PHP I'd simply use "foreach", because I've done exactly the same thing in PHP and it worked fine. I tried foreach in C# too, but couldn't get it to work.
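
    For reference, foreach does work over regex matches in C#. Below is a minimal sketch, assuming a console program rather than the WinForms app above; the class name ForeachMatches and the method CaptureAll are made up for illustration. One likely stumbling block: on older .NET Framework versions MatchCollection is not generic, so the loop variable has to be declared as Match (with var it would be typed as object).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original program.
public static class ForeachMatches
{
    // Returns the text captured by group 1 for every match of pattern in source.
    public static List<string> CaptureAll(string source, string pattern)
    {
        var captured = new List<string>();

        // Declaring the loop variable as Match (not var) also works with the
        // non-generic MatchCollection of older .NET Framework versions.
        foreach (Match m in Regex.Matches(source, pattern, RegexOptions.Singleline))
        {
            captured.Add(m.Groups[1].Value);
        }
        return captured;
    }

    public static void Main()
    {
        string source =
            "<li class='colorbutton'><div style='background-color:DarkSlateBlue'>Hello</div></li>" + Environment.NewLine +
            "<li class='colorbutton'><div style='background-color:DarkSlateBlue'>Hello1</div></li>" + Environment.NewLine +
            "<li class='colorbutton'><div style='background-color:DarkSlateBlue'>Hello2</div></li>";
        string pattern = "<li class='colorbutton'><div style='background-color:DarkSlateBlue'>(.*?)</div></li>";

        // Print each captured value on its own line.
        foreach (string value in CaptureAll(source, pattern))
        {
            Console.WriteLine(value);
        }
    }
}
```

    Collecting the captures into a list first keeps the matching separate from the formatting, so the [Start]/[End] wrapper can be written once around whatever is done with the list.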

    Can someone help me please? :/

    Thanks

  2. #2  
    Community Veteran


    Join Date
    Jan 2008
    Posts
    2,664
    Thanks given
    493
    Thanks received
    627
    Rep Power
    980
    I don't completely understand what the full function of this program is supposed to be, but obviously the output will be like that, because you have it printing start/end etc for every match. I can change it to how you would like, but can I see the PHP code first?
    ~iKilem

  3. #3  
    Ed
    The function of the sample program above is to find the pattern in a specified string and output the part marked as (.*?).
    That was just an example; normally I'll be searching for the pattern in a website's source, but it's almost the same thing, just a different string to search.

    Here's the PHP one. It's not exactly the same, but it's the same principle. It finds the hottest Google trends, which are laid out like my example (one per line), and then prints all of them together on one line.

    Code:
    <?php
    $newtrends = array();
    $newtrendsclean = array();
    $outputstring = "";
    $logdata = "";

    // Fetch the hourly hot-trends feed.
    // Note: SCRAPE_TIMEOUT is a constant that is not defined in this snippet.
    $url = 'http://www.google.com/trends/hottrends/atom/hourly';
    $ch = curl_init();
    $timeout = SCRAPE_TIMEOUT;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $page = curl_exec($ch);
    curl_close($ch);

    // Group 1 is the link URL, group 2 is the link text (the trend).
    // The U flag makes the quantifiers ungreedy.
    preg_match_all('(<a href="(.+)">(.*)</a>)siU', $page, $matches);
    foreach ($matches[2] as $trend) {
        $trendas = ", $trend";
        $trend = addslashes($trend);
        $trend = substr($trend, 0, 99);
        echo "$trendas";
    }

    ?>
    The trends are originally like this on the website:
    Code:
    <li><span class="Volcanic equal"><a href="http://www.google.com/trends/hottrends?q=chad+jones&date=2010-6-25&sa=X">chad jones</a></span></li> 
    <li><span class="On_Fire equal"><a href="http://www.google.com/trends/hottrends?q=peggy+west&date=2010-6-25&sa=X">peggy west</a></span></li> 
    <li><span class="Spicy up4"><a href="http://www.google.com/trends/hottrends?q=taste+of+chicago+2010&date=2010-6-25&sa=X">taste of chicago 2010</a></span></li> 
    <li><span class="Spicy equal"><a href="http://www.google.com/trends/hottrends?q=kate+gosselin+botox&date=2010-6-25&sa=X">kate gosselin botox</a></span></li> 
    <li><span class="Spicy down2"><a href="http://www.google.com/trends/hottrends?q=jobbie+nooner+2010&date=2010-6-25&sa=X">jobbie nooner 2010</a></span></li> 
    <li><span class="Spicy equal"><a href="http://www.google.com/trends/hottrends?q=nba+draft+grades&date=2010-6-25&sa=X">nba draft grades</a></span></li>
    In other words, one per line just like in my example, and the PHP code outputs the trends all on one line like this:
    Code:
    , chad jones, peggy west, taste of chicago 2010, kate gosselin botox etc.
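
    A rough C# counterpart of the preg_match_all call above might look like the sketch below. It assumes the page source is already in a string (the download step is left out), and the names TrendLinks and ExtractLinkTexts are invented for this example. .NET has no equivalent of PCRE's U flag, so the quantifiers are written as lazy (.+? and .*?) instead, while the s and i flags map to RegexOptions.Singleline and RegexOptions.IgnoreCase.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class for illustration only.
public static class TrendLinks
{
    // Returns the link text (group 2) of every <a href="...">...</a> in page.
    public static List<string> ExtractLinkTexts(string page)
    {
        // Lazy quantifiers stand in for PHP's U (ungreedy) flag.
        var regex = new Regex("<a href=\"(.+?)\">(.*?)</a>",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);

        var texts = new List<string>();
        foreach (Match m in regex.Matches(page))
        {
            texts.Add(m.Groups[2].Value);
        }
        return texts;
    }

    public static void Main()
    {
        // Two of the sample lines quoted in the post above.
        string page =
            "<li><span class=\"Volcanic equal\"><a href=\"http://www.google.com/trends/hottrends?q=chad+jones&date=2010-6-25&sa=X\">chad jones</a></span></li>" + Environment.NewLine +
            "<li><span class=\"On_Fire equal\"><a href=\"http://www.google.com/trends/hottrends?q=peggy+west&date=2010-6-25&sa=X\">peggy west</a></span></li>";

        // Prints: chad jones, peggy west
        Console.WriteLine(string.Join(", ", ExtractLinkTexts(page).ToArray()));
    }
}
```

    Unlike the PHP loop, which echoes ", " before every item, string.Join only puts the separator between items, so the output has no leading comma.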

  4. #4  
    Community Veteran


    Change this
    Code:
    for (Match m1 = r1.Match(source); m1.Success; m1 = m1.NextMatch())
    {
        textBox1.Text += "[Start]" + Environment.NewLine +
        "The string is: " + m1.Groups[1] + Environment.NewLine +
        "[End]" + Environment.NewLine;
    }
    to this
    Code:
    textBox1.Text += "[Start]" + Environment.NewLine + "The string is:";
    for (Match m1 = r1.Match(source); m1.Success; m1 = m1.NextMatch())
    {
        textBox1.Text += " " + m1.Groups[1];
    }
    textBox1.Text += Environment.NewLine + "[End]" + Environment.NewLine;
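
    Note that the loop above separates the values with a single space. If the exact "Hello, Hello1, Hello2" format from the first post is wanted, one option is to collect the captures and let string.Join insert the commas. This is only a sketch under assumptions: it is written as a console program, and the names JoinMatches and JoinCaptures are invented here, not part of the original code.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class for illustration only.
public static class JoinMatches
{
    // Joins the group-1 capture of every match into one comma-separated string.
    public static string JoinCaptures(string source, string pattern)
    {
        string[] values = Regex.Matches(source, pattern, RegexOptions.Singleline)
            .Cast<Match>()                   // MatchCollection may be non-generic
            .Select(m => m.Groups[1].Value)  // keep only the captured text
            .ToArray();

        // string.Join places ", " between items only, never before or after.
        return string.Join(", ", values);
    }

    public static void Main()
    {
        string source =
            "<li class='colorbutton'><div style='background-color:DarkSlateBlue'>Hello</div></li>" + Environment.NewLine +
            "<li class='colorbutton'><div style='background-color:DarkSlateBlue'>Hello1</div></li>" + Environment.NewLine +
            "<li class='colorbutton'><div style='background-color:DarkSlateBlue'>Hello2</div></li>";
        string pattern = "<li class='colorbutton'><div style='background-color:DarkSlateBlue'>(.*?)</div></li>";

        Console.WriteLine("[Start]");
        Console.WriteLine("The string is: " + JoinCaptures(source, pattern));
        Console.WriteLine("[End]");
    }
}
```

    In the WinForms version, the three Console.WriteLine calls would be replaced by a single assignment to textBox1.Text, which also avoids appending to the textbox once per match.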

    PS: sorry for the late reply (dinner).
    ~iKilem

  5. #5  
    Ed
    Hehe, it's ok.
    Let me try that now and post the result.

  6. #6  
    Ed
    Works like a charm, cheers.

  7. #7  
    Banned

    Join Date
    Feb 2009
    Posts
    1,676
    Thanks given
    24
    Thanks received
    25
    Rep Power
    0
    Hey, did I ever tell you to indent?

