25 May 2010, 15:33
I need to extract text from HTML for the purposes of indexing. The implementation language is C or C++.
I would use a SAX parser that handles HTML (libxml2?). Then all you should need to do is handle the text nodes.

Cheers
Justin