[opensuse] saving web pages in print-format
Hello:

Several web sites offer their pages in two versions, one for viewing in the browser and one for printing. E.g. at "http://www.mno.hu/portal/711171", clicking "Nyomtatható verzió" ("Printable version") at the end of the article opens a new tab/window with the print version of the article. What is important for me is that I can save this page as an HTML page using the browser's save function.

Other sites send the content straight to the printer when you click the print version. E.g. at "http://nol.hu/belfold/13_1__mit_titkol_borokai_gabor_", clicking on the printer icon opens the print dialog. But I would like to save the print version as an HTML page in such cases as well. Is it possible somehow?

Thank you in advance,
Istvan

-- 
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org
On Tue, May 4, 2010 at 8:41 PM, Istvan Gabor <suseuser04@lajt.hu> wrote:
> Hello:
> Several web sites offer their pages for viewing them in the browser and for printing.
> E.g. at "http://www.mno.hu/portal/711171" clicking "Nyomtatható verzió" at the end of the article opens a new tab/window with the print version of the article. What is important for me is that I can save this page as an HTML page using the browser's save function.

It is the same article (HTML) without all the fluff, so you can simply use "Save As" in HTML format.

> Other sites direct the content to the printer when you click the print version. E.g. at "http://nol.hu/belfold/13_1__mit_titkol_borokai_gabor_" clicking on the printer icon opens the printer dialog.

This one, by contrast, invokes the browser's print action (Ctrl-P). You can print to paper, or print to a PS or PDF file and then use a filter to convert that to HTML.

> But I would like to save the print version as an HTML page in such cases as well. Is it possible somehow?

pdftohtml? <http://pdftohtml.sourceforge.net/> It might fit your requirement (I have not used it myself).

-- 
Arun Khan
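The "print to PDF, then convert" route suggested above could be scripted roughly as follows. This is only a sketch under assumptions: it assumes the `pdftohtml` command-line tool is installed, that the installed version accepts the `-noframes` option, and that the result is written to `<out_base>.html`. The function names and file names are hypothetical, chosen for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftohtml_command(pdf_path, out_base):
    """Return the argument list for converting one PDF into one HTML file."""
    # -noframes asks pdftohtml for a single HTML file instead of a frameset
    # (assumption: the installed pdftohtml supports this option).
    return ["pdftohtml", "-noframes", pdf_path, out_base]


def convert_pdf_to_html(pdf_path, out_base):
    """Run pdftohtml and return the path where the HTML is expected."""
    if shutil.which("pdftohtml") is None:
        raise RuntimeError("pdftohtml is not installed or not on PATH")
    subprocess.run(build_pdftohtml_command(pdf_path, out_base), check=True)
    # Assumption: the tool writes "<out_base>.html".
    return Path(out_base + ".html")
```

A call such as `convert_pdf_to_html("article.pdf", "article")` would then be expected to leave the result in `article.html`. The converted HTML is rebuilt from the PDF layout, so it will usually not match the site's original markup.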
On 2010-05-04 17:11, Istvan Gabor wrote:
> Hello:
> Several web sites offer their pages for viewing them in the browser and for printing.
> E.g. at "http://www.mno.hu/portal/711171" clicking "Nyomtatható verzió" at the end of the article opens a new tab/window with the print version of the article. What is important for me is that I can save this page as an HTML page using the browser's save function.
> Other sites direct the content to the printer when you click the print version. E.g. at "http://nol.hu/belfold/13_1__mit_titkol_borokai_gabor_" clicking on the printer icon opens the printer dialog. But I would like to save the print version as an HTML page in such cases as well. Is it possible somehow?
It depends not only on the browser but also on the particular web page. The moment you click on "Print", you actually get a new page, which can have different properties and limitations. Sometimes I select (highlight) all the text in the browser and then copy and paste it into OpenOffice.

-- 
Cheers / Saludos,
Carlos E. R. (from 11.2 x86_64 "Emerald" GM (Minas Tirith))
Hello, On May 4 17:11 Istvan Gabor wrote (shortened):
> Several web sites offer their pages for viewing them in the browser and for printing. ... I would like to save the print version as an html page

I am really not a web content expert, but I think that nowadays the layout of the plain HTML (i.e. how the plain HTML is shown on the screen or formatted for printing) is often defined via two different CSS style sheets, one for viewing and one for printing, so that saving only the plain HTML may leave out what defines the layout. Furthermore, saving only the plain HTML will leave out all images.

If you want to save what you see when viewing and printing the page, I think it is better to let the browser save it as a PostScript or, preferably, a PDF file. Of course, if you need only the actual plain information that is in the HTML, you can save the plain HTML.

Alternatively, you may try a web downloader like wget to download the plain HTML together with all the files that are necessary to properly display a given HTML page, e.g. via "wget ... --page-requisites"; see "man wget":
--------------------------------------------------------------
Actually, to download a single page and all its requisites (even if they exist on separate websites), and make sure the lot displays properly locally, this author likes to use a few options in addition to -p:

    wget -E -H -k -K -p http://<site>/<document>
--------------------------------------------------------------
But I have no idea how to tell wget to download the print version of an HTML page, unless it has an explicit URL.

Bottom line: your simple question "save an HTML page" leads to a more and more complicated answer, which depends on what you actually mean by your question.

Kind Regards
Johannes Meixner
-- 
SUSE LINUX Products GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany
AG Nuernberg, HRB 16746, GF: Markus Rex
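Whether a given page really uses a separate print style sheet, as described above, can be checked from its saved HTML. The following is a minimal sketch using only Python's standard `html.parser`. It looks solely at `<link rel="stylesheet" media="...">` elements and does not detect `@media print` rules inside a shared style sheet. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class PrintStylesheetFinder(HTMLParser):
    """Collect the href of every <link> style sheet whose media list includes 'print'."""

    def __init__(self):
        super().__init__()
        self.print_sheets = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values may be None (e.g. <link disabled>), so normalise them.
        a = {name: (value or "") for name, value in attrs}
        rel = a.get("rel", "").lower().split()
        media = [m.strip() for m in a.get("media", "").lower().split(",")]
        if "stylesheet" in rel and "print" in media and a.get("href"):
            self.print_sheets.append(a["href"])


def find_print_stylesheets(html):
    """Return the hrefs of the print-only style sheets referenced by the HTML text."""
    finder = PrintStylesheetFinder()
    finder.feed(html)
    return finder.print_sheets
```

If the returned list is non-empty, a plain "Save As" of the HTML alone will probably not reproduce the printed layout unless those CSS files are saved along with it.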
On May 5, 2010 at 9:46, Johannes Meixner <jsmeix@suse.de> wrote:
> Hello,
> On May 4 17:11 Istvan Gabor wrote (shortened):
> > Several web sites offer their pages for viewing them in the browser and for printing. ... I would like to save the print version as an html page
> I am really not a web content expert but I think that nowadays the layout of the plain HTML (i.e. how the plain HTML is shown on the screen or formatted for printing) is often defined via two different CSS style sheets for viewing and printing so that saving only the plain HTML may leave out what defines the layout.
> Furthermore saving only the plain HTML will leave out all images.
> If you like to save what you see when viewing and printing it, I think it is better to let the browser save it as PostScript or preferably as PDF file.
> Of course if you need only the actual plain information that is in the HTML, you can save the plain HTML.
> Alternatively you may try out a web downloader like wget to download the plain HTML together with all the files that are necessary to properly display a given HTML page e.g. via "wget ... --page-requisites", see "man wget":
> --------------------------------------------------------------
> Actually, to download a single page and all its requisites (even if they exist on separate websites), and make sure the lot displays properly locally, this author likes to use a few options in addition to -p:
>     wget -E -H -k -K -p http://<site>/<document>
> --------------------------------------------------------------
> But I have no idea how to tell wget to download the version for printing of a html page - unless it has an explicit URL.
> Bottom line: Your simple question "save a html page" leads to a more and more complicated answer which depends on what you actually mean with your question.
Arun, Johannes, thank you for your answers.

What I thought was that even in the second example I gave, the web page sent to the printer is the same plain HTML as in the first example, except that some command/script makes the browser print it directly instead of opening it in a window. I thought that I could apply some hack or trick so that the given page, instead of being sent to the print command, would be opened in a new window as in the first example. Then I could save that window as HTML.

I know wget, but it would be too difficult to tell wget the exact page I want to download, if it is possible at all.

As for printing to PDF/PS, I do it sometimes, but it is very tedious. The Opera web browser's print layout is very ugly and not really usable. In Firefox 3 and SeaMonkey 2 it is a pain to print to a file, as they cannot remember the user's settings such as headers/footers or the last directory used, and it is not possible to select a file in the save window and rename it. In Firefox 2 I could direct the print job to kprinter easily (in KDE 3.5); it was the best solution so far, though still not as straightforward as saving as an HTML page.

Once more, thank you for your help.
Istvan
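The "hack" described above could look something like the sketch below, but only if the assumption holds that the print version is ordinary HTML plus a script call to `window.print()`. Nothing in this thread confirms that the nol.hu page works that way. Given the HTML text of such a page (however it was obtained), the function removes explicit `window.print()` calls so the saved file opens without triggering the print dialog. The names are hypothetical, and calls written in other forms (for example a bare `print()`) are deliberately left alone.

```python
import re

# Matches explicit calls such as "window.print()" or "window.print();".
PRINT_CALL = re.compile(r"\bwindow\s*\.\s*print\s*\(\s*\)\s*;?")


def strip_print_calls(html):
    """Return the HTML text with explicit window.print() calls removed."""
    return PRINT_CALL.sub("", html)


if __name__ == "__main__":
    page = '<body onload="window.print();"><p>Article text</p></body>'
    print(strip_print_calls(page))
```

The cleaned text can then be written to a `.html` file and opened or saved like the first example's print page.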
Istvan Gabor wrote:
> What I thought was that even in the second example I gave the web-page sent to the printer is the same plain html stuff as in the first example except that some command/script makes the browser print it directly instead of opening it in a window. I thought that I could apply some hack or trick so that the given page - instead of being sent to the printer command - would be opened in a new window as in the first example. Then I can save that window as html.

Not sure if I understood correctly, but when I open the link you mentioned in a new tab and then choose "View Source", I see the HTML and I can save it. But as I said, maybe I am not following you correctly.

HTH
Togan
On 2010-05-05 10:40, Istvan Gabor wrote:
> What I thought was that even in the second example I gave the web-page sent to the printer is the same plain html stuff as in the first example except that some command/script makes the browser print it directly instead of opening it in a window. I thought that I could apply some hack or trick so that the given page - instead of being sent to the printer command - would be opened in a new window as in the first example. Then I can save that window as html.

What about using Print Preview, in the File menu?

-- 
Cheers / Saludos,
Carlos E. R. (from 11.2 x86_64 "Emerald" GM (Minas Tirith))
participants (5)
- Arun Khan
- Carlos E. R.
- Istvan Gabor
- Johannes Meixner
- Togan Muftuoglu