freq2.py: compute frequencies and percentages from column data.
This is a conversion of freq2.pl to Python. On Unix/Linux, the script can be run on multiple log files.
Limitations:
- This program is intended to be run from a Unix or Linux command line.
- If you run it on Windows or Mac, be sure to convert the line endings to the ones your operating system expects.
- Computes frequencies, percents, cumulative frequencies, and cumulative percents.
- Can only print the first 20 characters of a value, although it can compute frequencies for longer values.
- The order of values in the output is set by a character sort rather than a numeric sort (the program knows nothing of variable types like numeric or string).
- If you give a column range outside the logical record length of the data, you will still get output, but all values will be null.
- Since the program simply tallies values, dots in the data (SAS missing values) will be counted, as will codes such as NA and 999, but none of them will be labeled as missing.
- Output is sent to STDOUT, so redirect it to a file to save it.
- Download: freq2.py
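Several of the limitations above follow directly from treating every value as plain text. The sketch below is not the freq2.py source; it is a minimal illustration, and the helper name `column_values` and the sample records are assumptions made up for this example.

```python
from collections import Counter


def column_values(lines, start, end):
    """Return the text found in 1-based columns start..end of each line.

    Hypothetical helper for illustration; freq2.py may do this differently.
    """
    return [line.rstrip("\n")[start - 1:end] for line in lines]


# Character sort versus numeric sort: values are compared as strings,
# so "100" sorts before "2".
values = ["10", "9", "2", "100"]
char_sorted = sorted(values)          # ['10', '100', '2', '9']
num_sorted = sorted(values, key=int)  # ['2', '9', '10', '100']

# A column range past the end of the record still yields one value
# per line, but every value is the empty string.
records = ["abc\n", "abd\n"]
out_of_range = Counter(column_values(records, 10, 12))

# Dots (SAS missing) are tallied like any other value.
with_dots = Counter(column_values(["1\n", ".\n", ".\n"], 1, 1))

print(char_sorted, num_sorted)
print(out_of_range)
print(with_dots)
```

A slice beyond the end of a Python string returns an empty string rather than raising an error, which matches the "you will still get output" behavior described above.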
This utility is part of a larger collection of text-processing tools. There are only three Python utilities at this time, and many more Perl utilities. As they get converted, they'll be added to this page.
Usage
Assuming this program is executable and its line endings match your OS (Unix, Linux, and modern macOS: LF; Windows: CRLF; classic Mac OS: CR), its command line is:
freq2.py [-h] [-c[#-#][#]] [filename...]
Example: freq2.py -c5-7 filename.dat (column numbers in the file start at 1)
- where: -c#-# indicates the start and end column numbers of the variable
- -c# indicates a single-column variable at column #
- filename is the name of the LRECL (logical record length) data file
- -h shows this help screen
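To make the column options concrete, here is a hedged sketch of how a `-c#-#` or `-c#` argument could be turned into a slice of a record. The function names `parse_column_spec` and `extract` are hypothetical and are not taken from freq2.py; the only behavior assumed from the text above is that columns are numbered from 1 and that the end column is inclusive.

```python
import re


def parse_column_spec(spec):
    """Turn '-c5-7' or '-c5' into a (start, end) pair of 1-based columns.

    Hypothetical helper; the real script may parse its options differently.
    """
    match = re.fullmatch(r"-c(\d+)(?:-(\d+))?", spec)
    if match is None:
        raise ValueError(f"not a column option: {spec!r}")
    start = int(match.group(1))
    # A single number means a one-column variable, so end equals start.
    end = int(match.group(2)) if match.group(2) else start
    if start < 1 or end < start:
        raise ValueError(f"bad column range: {spec!r}")
    return start, end


def extract(line, start, end):
    """Return columns start..end (1-based, inclusive) of a record."""
    return line[start - 1:end]


start, end = parse_column_spec("-c5-7")
print(start, end, repr(extract("abcdefghij", start, end)))
```

Subtracting 1 from the start column converts the 1-based column number to Python's 0-based index, while the inclusive end column can be used as the slice end unchanged.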
As an example, the command freq2.py -c1 data.file might produce output like:
    > freq2.py -c1 data.file
                                                          Page 1
    Frequencies for the values in columns 1-1
    in the file "data.file"

                                    Cumulative  Cumulative
    Value    Frequency   Percent    Frequency   Percent
    -------  ---------   -------    ----------  ----------
    0             2214     11.37          2214       11.37
    1             1009      5.18          3223       16.55
    2              533      2.74          3756       19.28
    9            15721     80.72         19477      100.00

Results from pspp 2.01 for the same data:

    cntry
    +-------+---------+-------+-------------+------------------+
    |       |Frequency|Percent|Valid Percent|Cumulative Percent|
    +-------+---------+-------+-------------+------------------+
    |Valid 0|     2214|  11.4%|        11.4%|             11.4%|
    |      1|     1009|   5.2%|         5.2%|             16.5%|
    |      2|      533|   2.7%|         2.7%|             19.3%|
    |      9|    15721|  80.7%|        80.7%|            100.0%|
    +-------+---------+-------+-------------+------------------+
    |Total  |    19477| 100.0%|             |                  |
    +-------+---------+-------+-------------+------------------+

You can see that the counts, percents, and cumulative percents match.
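The arithmetic behind the four output columns can be sketched as follows. This is an illustration using the counts from the example above, not the freq2.py source; the function name `frequency_table` and the print layout are assumptions.

```python
from collections import Counter


def frequency_table(counts):
    """Build (value, freq, percent, cum_freq, cum_percent) rows.

    Values are ordered by a character sort, as described in the
    limitations. Hypothetical helper for illustration only.
    """
    total = sum(counts.values())
    rows = []
    running = 0
    for value in sorted(counts):
        freq = counts[value]
        running += freq
        rows.append((value, freq, 100.0 * freq / total,
                     running, 100.0 * running / total))
    return rows


# Counts taken from the sample output for column 1 of data.file.
counts = Counter({"0": 2214, "1": 1009, "2": 533, "9": 15721})

for value, freq, pct, cum_freq, cum_pct in frequency_table(counts):
    # Only the first 20 characters of a value are shown.
    print(f"{value[:20]:<20} {freq:>9} {pct:>7.2f} {cum_freq:>10} {cum_pct:>10.2f}")
```

Each percent is the value's count divided by the total number of records (19477 here), and each cumulative column is a running sum over the sorted values.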
Last Modified: Tue Mar 17 21:58:42 EDT 2026