1

Topic: HTML parser wanted!

Friends! I'm looking for the subject for the C language. Input: an HTML page with its code. Output: a list of all URLs from that page. Scripts are not needed. Is there such a thing?

2

Re: HTML parser wanted!

Hello, Poseidon, you wrote:
P> Friends! I'm looking for the subject for the C language.
P> Input: an HTML page with its code.
P> Output: a list of all URLs from that page. Scripts are not needed.
P> Is there such a thing?
I once wrote something for roughly this purpose: https://www.codeproject.com/Articles/14 … -Tokenizer
It is a .h file and written for C++, but porting it to C is elementary. A variation of it is used in Sciter.

3

Re: HTML parser wanted!

Hello, , you wrote:
> 2) regular expressions
Oh! Exactly!

4

Re: HTML parser wanted!

Hello, , you wrote: "in well-formed XML it is possible..."
Aha: <a href='foo.bar'><![CDATA[<a href="bar.foo">...</a>]]></a>
And as for "if you are careful, with a tolerance of two bast shoes to the right of the sun", there was no need to strain: it could have been shortened to "maybe", "" or simply "sort of"...

5

Re: HTML parser wanted!

Hello, Poseidon, you wrote:
P> Friends! I'm looking for the subject for the C language.
Write it in Go. It has one: https://godoc.org/golang.org/x/net/html

6

Re: HTML parser wanted!

Hello, Pzz, you wrote:
P>> Friends! I'm looking for the subject for the C language.
Pzz> Write it in Go. It has one: https://godoc.org/golang.org/x/net/html
Better to do it in PHP: http://simplehtmldom.sourceforge.net/
And otherwise there is this: https://github.com/lexborisov/myhtml

7

Re: HTML parser wanted!

Hello, kov_serg, you wrote:
P>>> Friends! I'm looking for the subject for the C language.
Pzz>> Write it in Go. It has one: https://godoc.org/golang.org/x/net/html
_> Better to do it in PHP: http://simplehtmldom.sourceforge.net/
The Go compiler produces a statically linked executable that needs nothing else in order to run. PHP, as far as I know, cannot do that. Besides, Go is a modern, pleasant and convenient language of the C family, with static type checking and all that. It is simple and clear, and it can be mastered in a week. And it produces machine code rather than being interpreted.

8

Re: HTML parser wanted!

Hello, Poseidon, you wrote:
P> Friends! I'm looking for the subject for the C language.
P> Input: an HTML page with its code.
P> Output: a list of all URLs from that page. Scripts are not needed.
P> Is there such a thing?
https://github.com/google/gumbo-parser

9

Re: HTML parser wanted!

Hello, Pzz, you wrote:
Pzz>>> Write it in Go. It has one: https://godoc.org/golang.org/x/net/html
_>> Better to do it in PHP: http://simplehtmldom.sourceforge.net/
Pzz> The Go compiler produces a statically linked executable that needs nothing else in order to run. PHP, as far as I know, cannot do that.
I meant that HTML is better handled by a script, which is easy to fix, unlike a compiled program. HTML does tend to change over time. So PHP, Lua or Perl are better here either way.

10

Re: HTML parser wanted!

Hello, kov_serg, you wrote:
_> I meant that HTML is better handled by a script, which is easy to fix, unlike a compiled program. HTML does tend to change over time.
_> So PHP, Lua or Perl are better here either way.
Well, first, the programs that HTML is actually intended for are by no means scripts. Second, if something stops working, it takes some time to figure out what is wrong anyway. Against that background, recompiling is a simple matter. And third, the semantics of tags changes a bit and new tags appear, but the syntax changes little.

11

Re: HTML parser wanted!

Hello, Pzz, you wrote:
Pzz> Well, first, the programs that HTML is actually intended for are by no means scripts. Second, if something stops working, it takes some time to figure out what is wrong anyway. Against that background, recompiling is a simple matter. And third, the semantics of tags changes a bit and new tags appear, but the syntax changes little.
Normally you need to pull specific information out of a page (text, links, prices, statistics, etc.). But the design of pages and their markup can also change over time. And it is easier to change a text file than a binary. Moreover, such scripts are easier to keep in a version control system.

12

Re: HTML parser wanted!

Hello, kov_serg, you wrote:
_> Normally you need to pull specific information out of a page (text, links, prices, statistics, etc.). But the design of pages and their markup can also change over time. And it is easier to change a text file than a binary. Moreover, such scripts are easier to keep in a version control system.
But you should not put binaries into version control. You should put their source code there.

13

Re: HTML parser wanted!

Hello, Pzz, you wrote:
Pzz> But you should not put binaries into version control. You should put their source code there.
Normally the part that collects data from sites is only a small share of the source code, and rebuilding everything for the sake of such a trifle is hardly worth it.