Writing a text filter
This is an outline of how to write your own text filter user tool.
Why?
I worked out how to do this because I wanted to filter error messages from a huge SQL script output file, but you can use this technique to manipulate a file in any way. Imagine if search and replace had even more power than regular expressions. The only limits are your programming ability and imagination.
How?
Set up a user tool to run your filter and select the "Run as text filter" option. You don't need any of the special arguments like $(FileName), because a text filter is always run on the content of the current EditPlus window. The command and argument settings depend on how your filter must be called.
Of course, you also have to write the filter. My first example below is Java, but any language that can read the standard input stream and write to the standard output stream is fine. If you're familiar with the idea of writing a utility that runs in a command line pipe, this is very similar.
The general approach is that the content of the current file is fed to your program, which reads it from standard input. Your code decides what to do with this input: it can output some or all of it, add or replace sections, or generate something entirely new. Meanwhile, you can also do anything else you fancy with the text, like e-mail the juicy bits to your granny.
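The read, decide, write loop described above can be sketched in a few lines. This is only an illustration in Python 3 and not one of the original examples: the names keep_line and run_filter, and the rule of dropping lines that start with "--", are made up for the sketch.

```python
import sys


def keep_line(line):
    """Hypothetical rule for this sketch: drop lines that start with '--'."""
    return not line.startswith("--")


def run_filter(source, sink):
    """Copy lines from source to sink, skipping those that keep_line rejects."""
    for line in source:
        if keep_line(line):
            sink.write(line)
    sink.flush()


if __name__ == "__main__":
    # Run as a filter: text arrives on standard input and the result
    # is whatever gets written to standard output.
    run_filter(sys.stdin, sys.stdout)
```

Keeping the decision in its own function means the same skeleton can be reused for a different filter by changing only keep_line.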
What if it goes wrong?
Just like using search and replace, if you don't like what the filter has done to your text, you can undo.
Examples
Java
This Java code strips from SQL script output the messages that report success, leaving only the error messages.
import java.io.*;
import java.util.HashSet;
import java.util.Set;

public class SPOutStripper
{
    // Lines that report success and should be removed from the output
    static Set<String> strippers = new HashSet<String> ();
    static
    {
        strippers.add ( "" ); // a blank line
        strippers.add ( "Table dropped." );
        strippers.add ( "Table created." );
        strippers.add ( "1 row created." );
        strippers.add ( "Commit complete." );
        strippers.add ( "Table altered." );
        strippers.add ( "1 row updated." );
        // ...and many others
    }

    public static void main ( String [] args )
        throws Exception // Lazy programmer hopes IOException won't bite him
    {
        BufferedReader in = new BufferedReader ( new InputStreamReader ( System.in ) );
        PrintWriter out = new PrintWriter ( new BufferedWriter ( new OutputStreamWriter ( System.out ) ) );
        String line;

        // Loop through lines of input
        while ( null != ( line = in.readLine () ) )
        {
            // Check whether line should be stripped out
            if ( ! strippers.contains ( line ) )
            {
                // If it shouldn't, send it back out again
                out.println ( line );
            }
        }
        out.flush (); // Important!

        // Finished - tidy up
        out.close ();
        in.close ();
    }
}
Perl
Perl code for removing leading and trailing whitespace (spaces and tabs)
#!/usr/bin/perl
use warnings;
use strict;

while (my $text = <STDIN>) {
    chomp $text;
    # Remove leading and trailing spaces and tabs
    $text =~ s/^[ \t]+|[ \t]+$//g;
    print "$text\n";
}
JavaScript or VBScript
This example is in JavaScript, run through Windows Script Host. A VBScript version works in basically the same way. Run it as: cscript //NoLogo "c:\path to tool\tool.js"
var stdin = WScript.StdIn;
var stdout = WScript.StdOut;
var input = stdin.ReadAll();

/* Here you do something with the input.
   But since this is a demo, we're just going to write it back out. */
stdout.Write(input);
Python
This example attempts to tidy XML. It is written for Python 2 and can be run as an EditPlus text filter tool or from the command line.
import os,sys,re

def openAnything(source):
    """Cribbed from diveintopython.org
    """
    if source == "-":
        return sys.stdin

    # try to open with urllib (if source is http, ftp, or file URL)
    import urllib
    try:
        return urllib.urlopen(source)
    except (IOError, OSError):
        pass

    # try to open with native open function (if source is pathname)
    try:
        return open(source, 'r')
    except (IOError, OSError):
        pass

    # treat source as string
    import StringIO
    return StringIO.StringIO(str(source))

def prettyUp ( xml ):
    """
    Based on http://www.faqts.com/knowledge_base/view.phtml/aid/4334/fid/538
    """
    # Split into tags and the text between them
    parts = re.split ( '(<.*?>)', xml )
    level = 0
    wasText = False
    out = ""
    for part in parts:
        # ignore empty part
        if part.strip ( ) == '':
            continue
        # opening tags
        if part [ 0 ] == '<' and part [ 1 ] != '/' and part [ 1 ] != '?' and part [ 1 ] != '!':
            print # bare print starts a new line (Python 2)
            sys.stdout.write ( '\t' * ( level ) + part )
            # short-cut empty tag
            if part [ -2 : ] != '/>':
                level += 1
            wasText = False
        # closing tags
        elif part [ : 2 ] == '</':
            level -= 1
            if not wasText:
                print
                sys.stdout.write ( '\t' * ( level ) )
            sys.stdout.write ( part )
            wasText = False
        # text
        else:
            sys.stdout.write ( part )
            wasText = True

# No argument: read standard input (text filter); one argument: read that source
if len ( sys.argv ) == 1:
    xml = openAnything ( "-" ).read ()
elif len ( sys.argv ) == 2:
    xml = openAnything ( sys.argv [ 1 ] ).read ()
else:
    xml = None
    sys.stderr.write ( "Wrong number of arguments.\n" )

if None != xml:
    prettyUp ( xml )
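For comparison, here is a rough Python 3 sketch of the same job that swaps the regular-expression splitting for the standard library's xml.dom.minidom parser. It is a different technique from the script above, not a port of it, and the function name pretty_xml is made up for the sketch. Unlike the regex approach, it assumes the input is well-formed XML (the parser raises an error otherwise), and minidom adds an XML declaration at the top of its output.

```python
import sys
from xml.dom import minidom


def pretty_xml(xml_text, indent="\t"):
    """Parse xml_text and return it re-serialised with indentation."""
    document = minidom.parseString(xml_text)
    pretty = document.toprettyxml(indent=indent)
    # Drop whitespace-only lines, which show up when the input
    # already contained some indentation of its own.
    kept = [line for line in pretty.splitlines() if line.strip()]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    text = sys.stdin.read()
    if text.strip():
        sys.stdout.write(pretty_xml(text))
```

Because a parse error leaves standard output empty, it may be worth checking what the editor does with an empty result before relying on this for hand-edited XML.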
Python again
This is surprisingly useful. It lines up text into columns by inserting spaces, for example from:
9 whatever
999 whatever
99 whatever
9999 whatever
to:
9    whatever
999  whatever
99   whatever
9999 whatever
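Note: This code has some quirks. For example, blank lines are dropped, and a line that doesn't contain the marker may get padding inserted before its last character. You can hit Undo if you don't like the result.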
You'll need to use "Prompt for arguments" ("$(Prompt)") after the script name to get a dialog where you can specify the text to be lined up. To match a regular expression instead, start the argument with / (so /c.t will line up cat, cot, etc.).
import os,sys,re

def openAnything ( source ):
    """Cribbed from diveintopython.org
    """
    if source == "-":
        return sys.stdin

    # try to open with urllib (if source is http, ftp, or file URL)
    import urllib
    try:
        return urllib.urlopen ( source )
    except ( IOError, OSError ):
        pass

    # try to open with native open function (if source is pathname)
    try:
        return open ( source, 'r' )
    except ( IOError, OSError ):
        pass

    # treat source as string
    import StringIO
    return StringIO.StringIO ( str ( source ) )

def findMarker ( line, marker ):
    # A marker starting with / is a regular expression
    if "/" == marker [ : 1 ]:
        match = re.search ( marker [ 1 : ], line )
        if None == match:
            return -1
        return match.start ()
    return line.find ( marker )

def lineUp ( text, marker ):
    lines = re.split ( '\n', text )
    # Column where the marker starts on the line where it appears furthest right
    maxStartLen = max ( findMarker ( line, marker ) for line in lines )
    for line in lines:
        if 0 < len ( line ):
            pos = findMarker ( line, marker )
            start = line [ : pos ]
            end = line [ pos : ]
            # Pad the text before the marker out to the common column
            print start + ( ' ' * ( maxStartLen - len ( start ) ) ) + end

# One argument: marker, text from standard input; two arguments: source, then marker
if len ( sys.argv ) == 2:
    text = openAnything ( "-" ).read ()
    marker = sys.argv [ 1 ]
elif len ( sys.argv ) == 3:
    text = openAnything ( sys.argv [ 1 ] ).read ()
    marker = sys.argv [ 2 ]
else:
    text = None
    sys.stderr.write ( "Wrong number of arguments.\n" )

if None != text:
    lineUp ( text, marker )
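The alignment step can also be sketched in Python 3 as a function that returns the result instead of printing it. This is a variation under my own assumptions, not the script above: the names find_marker and line_up are made up, it reads only from standard input, and it deliberately passes through unchanged any line that doesn't contain the marker (blank lines included).

```python
import re
import sys


def find_marker(line, marker):
    """Return the index where marker starts in line, or -1 if it is absent.

    A marker that starts with '/' is treated as a regular expression.
    """
    if marker.startswith("/"):
        match = re.search(marker[1:], line)
        return match.start() if match else -1
    return line.find(marker)


def line_up(text, marker):
    """Pad each line so that the marker starts in the same column everywhere."""
    lines = text.split("\n")
    positions = [find_marker(line, marker) for line in lines]
    width = max(positions, default=-1)
    result = []
    for line, pos in zip(lines, positions):
        if pos < 0:
            result.append(line)  # no marker here: keep the line as it is
        else:
            result.append(line[:pos].ljust(width) + line[pos:])
    return "\n".join(result)


if __name__ == "__main__":
    if len(sys.argv) == 2:
        sys.stdout.write(line_up(sys.stdin.read(), sys.argv[1]))
    else:
        sys.stderr.write("Expected exactly one argument: the marker.\n")
```

Called on the four-line example with the marker whatever, line_up pads each number out to the width of "9999 ", which gives the aligned version shown earlier.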