How to select only one column from an online chart?

Status
Not open for further replies.

MarcAI

Prominent
Mar 8, 2017
7
0
510
Hello,

How can one column (as opposed to all the content of a webpage) be selected? I'm attempting to select one column from a website chart!

Thank you
Marc
 
Are you able to provide a link to the source chart? That would be helpful.

It is also important to know what you want to do with that column: e.g., re-display it by itself, do calculations, search...?

Trusting that you are not committing any copyright violations.....

You may be able to capture/copy and paste in any number of ways.

Does the website provide a download option for the chart's data?

If so, it should be fairly straightforward depending on the data's format.

Once you download the chart's content(s) for example via .csv, .txt, or .xls then import that data into some application that recognizes the columns. Then you use a macro to pull out or otherwise extract the desired column and data.
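As a rough illustration of the "import, then pull out the desired column" step, here is a minimal Python sketch. It assumes the download is a .csv file with a header row; the sample text and the column name "Score" are made up for the example and are not from any real chart.

```python
import csv
import io

def select_column(csv_text, column_name):
    """Return the values of one named column from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[column_name] for row in reader]

# Hypothetical sample standing in for a downloaded chart file.
sample = "Name,Score,Date\nAlice,12,2017-03-08\nBob,7,2017-03-09\n"
print(select_column(sample, "Score"))
```

For a real file you would read its contents with open() first and pass the text in. A spreadsheet macro would do the same job; this is just one way to sketch it.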

Otherwise what you might do is highlight then copy (CTRL + C) the desired chart.

Paste it into Notepad, Excel, etc. to see how things turn out.


 
Not that I am aware of. Possible that there might be some add-on that could assist with extracting the data - not optimistic.

If the source chart does not provide some built-in "download data" option (button) for you then things can become cumbersome.

Data extraction requires a high level of consistency allowing identification of rows and columns plus the data types (text, number, date).

For example if I go to a webpage via Mozilla and find a chart to look at I have a variety of right-click options: E.g., "Save Page as", "Save Page to Pocket", "Print", and "View Page Source".

"View Page Source" will present the HTML code behind the chart. Probably a mix of HTML, JavaScript, maybe some database links worked in to show the data but not necessarily make the data available.

The starting point might be to copy the HTML code (trusting that the data is truly there) and paste the results into a simple text editor. Notepad perhaps. Again the data you desire may be "fed" to the image and not hard coded in. Especially if the chart data is dynamic: stock information, prices, some measurement that continually changes.

You then need to provide code that can read the Notepad text and identify ways to distinguish the rows and columns containing the desired information. Parsing is the terminology used.
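To give an idea of what such parsing code can look like, here is a minimal Python sketch using the standard library's html.parser. It assumes the chart is a plain HTML table (tr/td/th tags) with the data actually present in the page source; the sample HTML and the ColumnParser and extract_column names are invented for the example. It does not handle merged cells or nested tables.

```python
from html.parser import HTMLParser

class ColumnParser(HTMLParser):
    """Collect the text of one column (0-based index) from table rows."""

    def __init__(self, column_index):
        super().__init__()
        self.column_index = column_index
        self.values = []
        self._cell_index = -1   # position of the current cell within its row
        self._in_cell = False
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cell_index = -1          # new row: restart cell counting
        elif tag in ("td", "th"):
            self._cell_index += 1
            self._in_cell = True
            self._text = []

    def handle_data(self, data):
        if self._in_cell:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._in_cell:
            self._in_cell = False
            if self._cell_index == self.column_index:
                self.values.append("".join(self._text).strip())

def extract_column(html, column_index):
    parser = ColumnParser(column_index)
    parser.feed(html)
    parser.close()
    return parser.values

# Hypothetical page source standing in for "View Page Source" output.
sample_html = """
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Widget</td><td>4.50</td></tr>
  <tr><td>Gadget</td><td>12.00</td></tr>
</table>
"""
print(extract_column(sample_html, 1))
```

If the chart is drawn by a script or is an image, the table tags will not be in the source and this approach returns nothing.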

Now you might be able to download the code (setting aside copyright issues for the moment) using Print and sending that output to a file as text. Then your code goes to that file and picks out (for example) the next 32 characters after the first comma in every other line starting at line 40. Again tedious and cumbersome.

Plus if the webmasters change source code then your code must be changed. Maybe the next 34 characters every 3rd line, beginning at line 36.
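The position-based approach described above might be sketched in Python like this. The sample lines and the numbers (start line, step, width) are placeholders chosen for the example; putting them in parameters at least means a layout change only needs new numbers, not new code.

```python
def slice_after_comma(lines, start_line, step, width):
    """From every `step`-th line beginning at 1-based `start_line`,
    return up to `width` characters following the first comma."""
    picked = []
    for line in lines[start_line - 1::step]:
        head, comma, tail = line.partition(",")
        if comma:  # skip lines that contain no comma
            picked.append(tail[:width])
    return picked

# Hypothetical text file contents, one string per line.
sample_lines = [
    "header line",
    "row1,alpha   ,x",
    "separator",
    "row2,beta    ,y",
    "separator",
    "row3,gamma   ,z",
]
print(slice_after_comma(sample_lines, start_line=2, step=2, width=5))
```

For the case in the post you would call it with start_line=40, step=2, width=32, and change those to 36, 3 and 34 if the page layout changed that way.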

[Side bar: Going back to copyright: If the data is public then you should be able to use it freely. If the data is an internal webpage within your organization then you may be able to get direct access by talking with the Webmaster or Data Manager.
If the data source is private or otherwise copyrighted you should get permission to use the data or at a minimum give proper credit to the source.]

Proceeding:

I can simplify the capture process and skip the HTML code part. If I highlight the chart (all headers, rows, columns), and do a CTRL + C (copy) I can then open Notepad and do a CTRL + V to paste the data in as simple text. The data is now accessible but the spacing (headers, rows, columns) is all scattered about. Different spacing, some punctuation marks, etc. May or may not wrap - one long line of text perhaps.

Either way Copy/Paste will lose most if not all of the original chart formatting. But depending on what formatting and consistency remains I may be able to import the file directly into Excel as .csv or .txt and end up with a spreadsheet where every 4th row, second column is the desired information. Or if the chart is simple just manually edit the Copy/Paste results to isolate and organize the desired chart data. Excel (and Access) have some powerful importing tools and options.
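If the pasted text happens to keep one chart row per line with tabs between the cells (an assumption; it depends on the browser and the page), picking out one column can be sketched in Python as follows. The sample text and the column_from_paste name are made up for illustration.

```python
import csv
import io

def column_from_paste(pasted_text, column_index, first_row=0, row_step=1):
    """Return one column (0-based) from tab-separated pasted text,
    taking every `row_step`-th non-empty row starting at `first_row`."""
    rows = [r for r in csv.reader(io.StringIO(pasted_text), delimiter="\t") if r]
    return [r[column_index] for r in rows[first_row::row_step]
            if len(r) > column_index]

# Hypothetical paste result: header row plus two data rows.
pasted = "Rank\tUser\tPosts\n1\tann\t510\n\n2\tbob\t42\n"
print(column_from_paste(pasted, 1))               # whole second column
print(column_from_paste(pasted, 1, first_row=1))  # same, skipping the header
```

The first_row and row_step arguments cover cases like "every 4th row, second column": set row_step=4 and column_index=1.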

My suggestion (trying to be useful) is to directly Copy and Paste the entire chart into Notepad. Then without change import the data to Excel.

See what happens and then refine the process to drill down to just the single column of data you need. Some steps may eventually be automated via Excel Macros. Key issue is the consistency of the data and its format.

Excel has some excellent parsing capabilities: E.g., if the imported data ends up with each row of the chart in a single cell, you can easily parse out characters 40 to 80 if that is where the data is.

And if any numerical data is involved (e.g., "123.45") it will probably come across as text. You will need to tell Excel to convert that text to numbers so you can add or otherwise do mathematical operations on that data.
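The same text-to-number issue shows up if you process the column with code instead of Excel. A minimal Python sketch, assuming commas are thousands separators (US-style numbers) and that anything unparseable should simply be skipped:

```python
def to_number(text):
    """Convert text such as ' 1,234.50 ' to a float; return None if not numeric."""
    cleaned = text.strip().replace(",", "")
    try:
        return float(cleaned)
    except ValueError:
        return None

# Hypothetical column values as they might arrive after a paste or import.
values = ["123.45", " 1,000 ", "n/a"]
numbers = [to_number(v) for v in values]
print(numbers)
print(sum(n for n in numbers if n is not None))
```

Entries that come back as None are the ones to check by hand, since they usually point to stray text or an unexpected format in that cell.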



 
Solution

MarcAI



Thank you for taking the time to describe possible solutions.

The type of column data I'm seeking comes from an Internet forum.

Are there any apps available to select columns in Internet forums?


 
No problem - you are welcome. However, looking back, I probably went too far into the "TLDR" zone.

Somewhat late to ask but do you just need to display the column as is, capture the data to display as data only, or capture the data to manipulate in some manner? E.g., some calculation, a search, a sort, some new link?

You could do things such as capture the chart as an image and use a photo editor to crop the image to the desired columns.

Again, copyright is likely to be involved so you must abide by the applicable rules.

If the column is in an internet forum there is still HTML or other code behind the webpage regardless of its nature.

Depending on the nature of the forum and data you may be able to contact the webmaster and request that some download options be provided. Easy enough to download the entire chart into a spreadsheet and then you just link your spreadsheet to the columns you need.




 