I'm having a real scripter-block here.
(Monday 25th February 2002)

As a long-time Perl scripter, I'm used to doing string manipulation with regexes. I've read and understood Mastering Regular Expressions well enough to have written a rather nice XML parser in pure Perl.

But today I have a problem I can't solve with regex.

I have the following kind of tags in my project:
<span ptal:content="hello">
    some text
</span>

No great problem there:
"/<((\w+)[^>]*)\s+ptal:content=\"([^\"]+)\"([^>]*)>(.*?)<\/\\2>/s"
will match the whole thing (the s modifier is needed so that . also matches the newlines inside the tag).
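To make the claim concrete, here is a minimal sketch of the same pattern translated to Python's re module. The translation, the re.DOTALL flag (so that . also matches newlines) and the variable names are my assumptions, not part of the original project.

```python
import re

# Python translation of the pattern above (an assumption, not the original code).
# re.DOTALL lets "." match newlines, so the multi-line tag body is covered.
PATTERN = re.compile(
    r'<((\w+)[^>]*)\s+ptal:content="([^"]+)"([^>]*)>(.*?)</\2>',
    re.DOTALL,
)

snippet = '<span ptal:content="hello">\n    some text\n</span>'

m = PATTERN.search(snippet)
tag_name = m.group(2)    # the tag name, reused by the \2 backreference
attr_value = m.group(3)  # the value of ptal:content
inner = m.group(5)       # everything between the opening and closing tag
```

Group 2 captures the tag name, so the backreference only accepts a closing tag of the same name.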

But now comes the challenge:
<span ptal:define="say hello">
  <span ptal:content="say">
      some text
  </span>
</span>


The problem is that any regex of the above kind will match only the first four lines, up to the inner closing tag, and leave the outer </span> behind:
<span ptal:define="say hello">
  <span ptal:content="say">
      some text
  </span>

</span>

That is because it takes the first opening tag and matches up to the first matching closing tag it finds, disregarding any nesting.

And making the regex greedy is no solution either, as it would then match from the first <span> to the very last </span>...

I guess some programmatic string parsing is called for here... Darn, if only I could do that...
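One possible shape for such programmatic parsing is a small scanner that counts nesting depth: +1 for every opening tag of the same name, -1 for every closing tag, stopping when the count returns to zero. The following is only a sketch in Python under simplifying assumptions. The function name find_element_end is made up for illustration, and comments, CDATA sections and ">" characters inside attribute values are not handled.

```python
import re

# Matches an opening or a closing tag; group 1 is "/" for closing tags.
TAG = re.compile(r'<(/?)(\w+)[^>]*>')

def find_element_end(text, start):
    """Return the index just past the closing tag that balances the
    opening tag beginning at text[start], or -1 if there is none."""
    first = TAG.match(text, start)
    if first is None or first.group(1) == '/':
        return -1
    name = first.group(2)
    if first.group(0).endswith('/>'):
        return first.end()  # self-closing tag, nothing to balance
    depth = 0
    for m in TAG.finditer(text, start):
        # Ignore tags with other names and self-closing tags.
        if m.group(2) != name or m.group(0).endswith('/>'):
            continue
        depth += -1 if m.group(1) else 1
        if depth == 0:
            return m.end()
    return -1

doc = (
    '<span ptal:define="say hello">\n'
    '  <span ptal:content="say">\n'
    '      some text\n'
    '  </span>\n'
    '</span>\n'
    '<p>after</p>'
)
end = find_element_end(doc, 0)
outer = doc[:end]  # the outer span including the nested one
```

A regex is still used to find individual tags. Only the pairing of opening and closing tags is done by counting, which is the part a single non-recursive pattern cannot express.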

Some pointers I found:

[ by Martin ]


Martin Spernau
© 1994-2003
