Save HTML page as text

I’m looking to use JavaScript to save a webpage as a text file.

It’s a sports site with lots of text, and I will be sorting through the text for items of interest later. The only code that I have come across is:

    Open a text stream for the file sport.txt 

For security reasons, you’re never going to get (reliable, cross-browser) access to the user’s file system. Your best bet is to spawn a separate page with just the text and provide a download option.

The example is a bit misleading, especially the line with a reference to the file system. If he wants to just render a text file, we’ll need to know more about his backend.

2 Answers

If you want to write your own utility script that grabs the content of a page and downloads it to a file, and you want to write it in JavaScript, you could use Node.

If you just need a command-line tool to do that, use wget.

Both these options run on many platforms.
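
As a rough illustration of the Node route, here is a minimal sketch, assuming Node 18 or later (for the built-in `fetch`). The function names `htmlToText` and `savePageAsText` are made up for this example, and the regex-based tag stripping is only an approximation of what a browser renders, not a real HTML parser.

```javascript
const fs = require('node:fs/promises');

// Rough tag stripper (an approximation, not a real HTML parser):
// drops <script>/<style> blocks, removes remaining tags,
// decodes a few common entities, and collapses whitespace.
function htmlToText(html) {
  return html
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/&nbsp;/g, ' ')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, '&')
    .replace(/\s+/g, ' ')
    .trim();
}

// Download a page and write its (approximate) text content to a file.
async function savePageAsText(url, outFile) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const html = await response.text();
  await fs.writeFile(outFile, htmlToText(html), 'utf8');
}

// Example call (placeholder URL and file name):
// savePageAsText('http://some.url', 'sport.txt').catch(console.error);
```

For anything beyond simple pages, a proper HTML parsing library would likely give cleaner text than the regular expressions used here.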

The code you posted does nothing, as it’s not valid JS code. And with such an unclear question, the answer may not be what you are asking for.

I’m not sure what you really want to save: the entire page source, or just the visible text that the browser renders. You also don’t specify the environment in which your script will run: is it a web browser or WSH?

I’ll post example code for both cases (page source/text). I’ll do my best to write at least one of them in JScript. However, it’s easier for me to write in VBScript, and since you said that’s not a problem, my second example will be in VBS.

To get the HTML source code (.JS):

    var url = 'http://some.url'; // set your page url here
    with (new ActiveXObject("Microsoft.XmlHttp")) {
        open('GET', url, false);
        send('');
        var data = responseText;
        with (new ActiveXObject("ADODB.Stream")) {
            Open();
            Type = 2; // adTypeText
            Charset = 'utf-8'; // specify correct encoding
            WriteText(data);
            SaveToFile("page.html", 2);
            Close();
        }
    }

To get the visible (rendered) text (.VBS):

    Dim url: url = "http://some.url" 'set your page url here'
    With WScript.CreateObject("InternetExplorer.Application", "IE_")
        .Visible = False
        .Navigate url
        'keep waiting while the page is still loading or the browser is busy'
        Do
            WScript.Sleep 100
        Loop While .ReadyState < 4 Or .Busy
        Dim data: data = .Document.Body.innerText
        With CreateObject("ADODB.Stream")
            .Open
            .Type = 2 'adTypeText'
            .Position = 0
            .Charset = "utf-8"
            .WriteText data
            .SaveToFile "output.txt", 2
            .Close
        End With
        .Quit
    End With


How to save (changed) content of local HTML as text file?

I am trying to do something specific and I am not sure it is possible. I have seen answers explaining how to do it with a textarea, but not the way I am asking. My code loads a text file and builds a bullet list based on its content. The bullet list can be both sorted (drag & drop) using jQuery UI and edited with contenteditable="true". Here is the code that accomplishes this:

    #sortable1, #sortable2 { list-style-type: none; margin: 0; padding: 0; zoom: 1; }
    #sortable1 li, #sortable2 li { margin: 0 5px 5px 20px; padding: 3px; width: 90%; }
    .ui-state-disabled li { color: green; margin: 0 5px 5px 0px; }
    .ui-state-default li

    FAVORITE SITES { !Variable=eBay !Variable=Google }
    AVOIDED SITES { !Variable=Yahoo !Variable=CraigsList }
    OTHER SITES { !Variable=Alexa !Variable=Amazon !Variable=Jet }
    MORE SITES

And the bullet list is outputted as a result. Using the sort and the edit I can re-arrange the list and edit/add bullets (you can see if you test snippet and save the text example to be loaded). The question is, after I make sorts and edits is there a way to save the changes back to the text file? I know there are limitations but it could even be saved as a new file or with a prompt so I can choose the save location and overwrite the old file. But is this possible and any idea on how to save the HTML edits? I suppose there are two functions needed. One would be to output the changes in the format consistent with the data file loaded and the other would be to save them. If this doesn't make sense please comment and I can clarify. It should be noted this is done in a completely local environment.
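
The two functions described above could be sketched roughly as follows. This is only an illustration under several assumptions: the data file is taken to use a brace-delimited layout like the sample (the exact delimiters and line breaks are a guess), the `ul[data-title]` selector and `data-title` attribute are hypothetical, and all function names are invented for this example. A browser cannot silently overwrite the original file, so the sketch offers the result as a download instead.

```javascript
// Turn [{ title, items }] into text, one section per line, in the
// brace-delimited layout assumed from the sample data.
function serializeSections(sections) {
  return sections
    .map(({ title, items }) => {
      const variables = items.map((item) => `!Variable=${item}`).join(' ');
      return `${title} { ${variables} }`;
    })
    .join('\n');
}

// Browser-only: read the edited lists back out of the DOM.
// The selector and the data-title attribute are hypothetical.
function readSectionsFromDom(root) {
  return Array.from(root.querySelectorAll('ul[data-title]')).map((ul) => ({
    title: ul.getAttribute('data-title'),
    items: Array.from(ul.querySelectorAll('li')).map((li) => li.textContent.trim()),
  }));
}

// Browser-only: offer the text as a file download via a temporary object URL.
function downloadText(filename, text) {
  const blob = new Blob([text], { type: 'text/plain;charset=utf-8' });
  const link = document.createElement('a');
  link.href = URL.createObjectURL(blob);
  link.download = filename;
  document.body.appendChild(link);
  link.click();
  link.remove();
  // Release the object URL once the click has been dispatched.
  setTimeout(() => URL.revokeObjectURL(link.href), 0);
}

// In the page, a "Save" button handler might do:
// downloadText('sites.txt', serializeSections(readSectionsFromDom(document)));
```

Keeping the serialization separate from the DOM reading makes it possible to check the output format without a browser.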


Download part of HTML page as text (PHP)

So if anyone has suggestions that'd be great, maybe I'm overlooking something. Thanks!

This question says the JS route is not possible (probable)

The question is: is the text on the server/in the DB, or only on the client? I don't know, but maybe the application involves a user who generates that text on the client only. Please clear up that doubt.

4 Answers

I would go with option 2. Simplest and fastest. The other ones are a bit contrived.

If you go with option 2, why even leave the textarea at all?

Sweet, sweet validation. Thanks. We're going to keep the textarea because it still might be the case that someone wants to paste part of it. Though it might go away in the future.

I would suggest the following: make your button replace the whole DOM of the page with your text. After that, the user will be able to simply press Ctrl + S or ⌘S. Not exactly what you want, but still a shortcut.

I guess you can do it with the following (jQuery):

    // .val() reads the textarea's current content; .text() inserts it as plain text
    $(document.body).text($('#textarea-id').val());

Appreciate the suggestion, though since this is going to be in OSS-space, I'd like to do it right and not muss up the UE. Thanks!

Following your second option, you could trigger your script with a keyword to send the data as an attachment.

Here’s an example of how it could look:

    if (isset($_GET['download'])) {
        header('Content-Type: application/octet-stream');
        header('Content-Disposition: attachment;filename="dump.data"');
        echo $data;
        exit;
    } else {
        echo '';
    }

  • Text already on the server:
    • Make a getfile.php that responds with that text as a file.
    • POST the text to getfile.php and respond with the file.
    • POST the text to getfile.php, store the file, and provide a link to download it (then you can delete the file or not, depending on your needs).
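
For comparison, the same "send the data as an attachment" idea can be sketched in JavaScript with Node's built-in `http` module. This is a minimal illustration, not a port of anyone's actual script: `attachmentHeaders` and `handleRequest` are hypothetical names, and the `?download` query keyword and `dump.data` file name simply mirror the PHP answer.

```javascript
const http = require('node:http');

// Headers that ask the browser to save the response instead of rendering it.
function attachmentHeaders(filename) {
  return {
    'Content-Type': 'application/octet-stream',
    'Content-Disposition': `attachment; filename="${filename}"`,
  };
}

// Hypothetical handler: "?download" in the query string triggers the download,
// otherwise a page with a download link is shown.
function handleRequest(req, res, data) {
  const url = new URL(req.url, 'http://localhost');
  if (url.searchParams.has('download')) {
    res.writeHead(200, attachmentHeaders('dump.data'));
    res.end(data);
  } else {
    res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
    res.end('<a href="?download">Download</a>');
  }
}

// To try it locally (port chosen arbitrarily):
// http.createServer((req, res) => handleRequest(req, res, 'example text')).listen(8080);
```

The `Content-Disposition: attachment` header is what makes the browser show a save dialog, regardless of the server-side language.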


    How to Convert HTML to Plain Text

    HTML (Hypertext Markup Language) is used to create Web pages. It is text based but includes "tags" that define how text on a Web page is displayed. Because the HTML codes are hidden by the Web browser, you can copy the viewable text from the browser window and paste it into any application that accepts text, such as the free editor included with Windows (Notepad) or Mac OS X (TextEdit). Some Web browsers can also save HTML pages as text without the tags.

    Save As Text with Web Browser

    Open the HTML document or Web page you want to convert to text in your Web browser software.

    Click the File menu and select "Save As" (or page menu and "Save As" in Internet Explorer).

    Choose "Text Pages" from the drop-down format menu and choose a destination for the text file.

    Click "Save," then exit the browser and locate the text file you saved. You can open this in any application that can read a text file, such as Notepad in Windows. Some text editors that read the saved file might display it with no carriage returns or line breaks. If that happens, try the next method.

    Copy and Paste to Notepad (Windows) or TextEdit (Macintosh)

    Open the HTML document or Web page you want to convert to text in your Web browser software.

    Click and drag over the text you want to convert or press Ctrl+A (Command+A on Macintosh) to select the entire page.

    Click the "Edit" menu, then "Copy."

    Open the Notepad application (Windows) or the TextEdit application (Macintosh) and click "Paste" under the "Edit" menu. You will now have the text contents of the Web page in the text editor window.

    Click the "File" menu, then "Save As" to save the new document as a text file, and you will have successfully converted the Web page HTML to text.

    • A full-feature Web editing tool such as Adobe Dreamweaver can open an HTML page into a text-only view and has a command under the "Edit" menu that allows you to copy the text, not the tags. You could then paste that into another application. You can also copy and paste Web page content into word-processing software such as Microsoft Word. After pasting, click the "File" menu and select "Save As" ("Office" menu, then "Save As" in Word 2007) and choose "Text Only" as the format for this document if you want to keep it in "plain text." Otherwise, choose the word processor's standard format (.doc for Word) and you will have a document that used to be HTML but is now text without the HTML codes.
    • The Safari Web browser does not have an option to save a Web page as a text-only document. Use the copy-and-paste method to convert the HTML to text.

    Katelyn Kelley worked in information technology as a computing and communications consultant and web manager for 15 years before becoming a freelance writer in 2003. She specializes in instructional and technical writing in the areas of computers, gaming and crafts. Kelley holds a Bachelor of Arts in mathematics and computer science from Boston College.
