Thursday, September 6, 2012

Get Your Text Data Into Maxima

Comma-separated values (CSV) files are often a convenient way to move text data between software products. The format is straightforward, as you can see by opening a CSV file in a text editor: each line holds one row of data, and the fields within a row are separated by commas. Excel can save a worksheet as a CSV file.

Once your data is in a CSV file, you can read it into Maxima with a little bit of code.

But first, some housekeeping. To have convenient access to routines that you or someone else has written, put the files that contain the code on Maxima's search path. The variables file_search_maxima and file_search_lisp list the paths that Maxima searches when it looks for code to load. You can change these values at start-up (or at any other point in a session). Place a file called maxima-init.mac in the location indicated by the environment variable MAXIMA_USERDIR and put code along the lines of the following in it:

file_search_maxima: 
    append(file_search_maxima, 
           ["fullpathto/Useful/Maxima/Functions/$$$.{mac,mc}"]);
file_search_lisp: 
    append(file_search_lisp, 
           ["fullpathto/Useful/Maxima/Functions/$$$.{o,lisp,lsp}"]);

(See here for more settings, and see chapter 32, Runtime Environment, of the Maxima manual.)
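
To confirm that the init file took effect, it can help to inspect the relevant values from inside a Maxima session. The following is a minimal sketch that uses only built-in names (maxima_userdir, file_search_maxima, file_search). The base name "CSV" is an assumption that matches the files listed for download further down, and the last call only finds something once those files are in place.

```maxima
/* Directory where Maxima looks for maxima-init.mac */
maxima_userdir;

/* The search patterns are a list; an entry appended in maxima-init.mac
   should show up as the last element */
last(file_search_maxima);

/* Returns the full path if a matching file is found on the search path,
   and false otherwise */
file_search("CSV");
```

If the last call returns false, the directory pattern or the file location is the first thing to check.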

To see more about how to interoperate between Lisp and Maxima, see chapter 3 of the Maxima manual. The downloadable sample code shows how to use some of these techniques.

Download the following two files and place them in the directory you have added to your search path.
  1. CSV.lisp
  2. CSV.mac
Start up Maxima and then type and execute:

load(csv);

You will now have access to 4 functions to help you get your text data into Maxima.


Function:
CSVread(filename)
CSVread(filename,linesToSkip)

CSVread reads the data in the file and returns it as a list of lists of strings. Each line of the file becomes a separate list. Optionally, you may pass a positive integer giving the number of leading rows to discard, which is handy if your data file contains one or more header rows.


Function:
CSVNumberColumns(filename, skips, columnsToKeep)

CSVNumberColumns reads data from the file specified by filename and discards the number of rows given by skips (which may be 0). It also excerpts the file, retaining only the columns requested in the list columnsToKeep. To keep all of the columns, pass in an empty list ([]). If you want only the first and second columns, pass in [1,2]. Every item in a retained column is expected to be a valid Maxima expression in string form; numeric or algebraic expressions qualify. Specifically, each item must be something that the Maxima function parse_string can interpret.
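
To get a feel for what parse_string accepts, here is a small sketch that runs without any CSV file. The list row is a made-up stand-in for one line of string fields; it is not taken from the sample code.

```maxima
/* Hypothetical row of CSV fields, all still strings */
row: ["1007.265919151515", "42", "x^2 + 1"]$

/* parse_string turns each string into a Maxima expression
   (it parses the text but does not evaluate it) */
parsed: map(parse_string, row);

/* The first field is now a number rather than a string */
numberp(first(parsed));
stringp(first(parsed));
```

A field that is not a valid Maxima expression, such as a text label, would make parse_string complain, which is why header rows and non-numeric columns need to be skipped or left out.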


Function:
MakeMatrix(d)

MakeMatrix creates a matrix from the data d, which is a list of lists. This is handy when you need your data in matrix form to pass along to a least squares function in Maxima, such as lsquares_estimates. Note that this function does not create a "deep" copy of the data in d; it is really just a separate "view" of the same data. As a note, I found that the display of the matrix results would sometimes be transposed. This only seemed to happen in output cells in Maxima, and the actual data layout was as expected anyway (a display glitch in my version of Maxima, though not a consistent one).
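
As a rough illustration of why the matrix form is useful, the sketch below builds a matrix from a list of lists with the built-in apply(matrix, ...) and hands it to lsquares_estimates. This is not the MakeMatrix implementation from the sample code, and both the four data points and the circle model are assumptions chosen only to keep the example small.

```maxima
load(lsquares)$

/* Hypothetical data: four points on the unit circle, as a list of lists */
d: [[1, 0], [0, 1], [-1, 0], [0, -1]]$

/* Built-in way to turn a list of rows into a matrix */
m: apply(matrix, d);

/* Fit x^2 + y^2 = A*x + B*y + C, which is linear in A, B and C.
   The center is then (A/2, B/2) and the radius is
   sqrt(C + A^2/4 + B^2/4). */
fit: lsquares_estimates(m, [x, y], x^2 + y^2 = A*x + B*y + C, [A, B, C]);
```

Writing the circle equation in this linear form is one possible design choice; it lets lsquares_estimates look for an exact solution instead of needing a starting guess for the parameters.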



Example:

load(csv)$ /* only needed once per Maxima start-up */
d: CSVNumberColumns("D:/path/circledata.csv",1,[2,3]);
m: MakeMatrix(d);

(%o9) [[1007.265919151515,1000.932075151515],[1008.902086,1002.038759],[1010.969034,1002.554592],[1013.1443,1002.268315],[1015.598881,1000.804232],[
1016.837804,998.9016388],[1017.35833,996.4303561],[1016.548809,993.5742165],[1014.687232,991.6508387],[1012.069706,990.6322456],[1009.539441,990.933599],[
1007.03337,992.4675972],[1005.545131,995.07334],[1005.520783,997.8698301]]
(%o10) matrix([1007.265919151515,1000.932075151515],[1008.902086,1002.038759],[1010.969034,1002.554592],[1013.1443,1002.268315],[1015.598881,1000.804232],[1016.837804,998.9016388],[1017.35833,996.4303561],[1016.548809,993.5742165],[1014.687232,991.6508387],[1012.069706,990.6322456],[1009.539441,990.933599],[1007.03337,992.4675972],[1005.545131,995.07334],[1005.520783,997.8698301])
