Software utensils

Wednesday, April 29, 2026

Thinking model expense is hard to predict

When Opus 4.6 came out, my colleagues were very excited because prompts which had previously elicited incorrect or disorganized answers were instead yielding correct analysis. It was a significant advance, and the change in our environment really highlighted how central model quality was to our experience. The opinion was widely held that Anthropic was at the top of the quality rankings, and this was consistent with what we saw comparing it with GPT, but what about the many other models available? I kept hearing about high-quality open source models that could rival the frontier models we were usually focused on, but up to that point I had not bothered to actually test them.

Since our access was coming via OpenRouter, it was relatively easy to run some queries across a wider variety of models without having to set up accounts, etc., and the results were intriguing. Opus did the best and was, as promised, pretty expensive. To my surprise, though, some models which were reputed to be cheap were nearly as expensive, despite coming up with much worse answers and despite headline input and output token prices that were much lower. The key here was that these were all "thinking" models, which also charge for intermediate reasoning tokens, and the number of those tokens apparently cannot be accurately predicted.

This means that it is not practical to predict cost by any method besides actual tests; the headline price tags can be easily overwhelmed by the number of reasoning tokens expended, which in my experience was all over the map. This makes me think that we really need a tool which can easily run our prompts against a wide variety of models and evaluate their performance, and also give us some grounding for our view of how the models stack up in terms of cost.
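
To make the arithmetic concrete, here is a minimal sketch in Python. The prices and token counts are invented purely for illustration, and the sketch assumes that reasoning tokens are billed at the output rate; check how your provider actually bills them.

```python
def request_cost(input_tokens, output_tokens, reasoning_tokens,
                 input_price_per_m, output_price_per_m):
    """Return the dollar cost of one request.

    Assumption: reasoning tokens are billed at the output-token rate.
    Prices are in dollars per million tokens.
    """
    billed_output = output_tokens + reasoning_tokens
    return (input_tokens * input_price_per_m
            + billed_output * output_price_per_m) / 1_000_000


# Hypothetical numbers: a pricey model that reasons briefly versus a
# cheap model that burns 40x as many reasoning tokens on the same prompt.
premium = request_cost(2_000, 500, 1_000,
                       input_price_per_m=15.0, output_price_per_m=75.0)
budget = request_cost(2_000, 500, 40_000,
                      input_price_per_m=1.0, output_price_per_m=3.0)
print(f"premium: ${premium:.4f}  budget: ${budget:.4f}")
```

With these made-up inputs the "cheap" model lands within about 15% of the expensive one, even though its headline prices are 15 to 25 times lower. The only way to know the reasoning token count is to run the prompt and read it from the usage data.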

Monday, December 26, 2022

GPT as a learning tool

I've really liked using GPT as a tutor. I recently needed to learn about SSO/SAML2, and it was such a luxury to be able to pose questions to GPT, and in particular to be able to express my understanding of the protocol and have GPT either confirm or correct it. I think it is pretty clear that this capacity to evaluate our expressed understanding of a concept is going to really accelerate our learning.

Normally when I learn some new concept online, I spend a lot of extra time on supplementary reading, looking to confirm or disprove the model of that concept in my mind. But in this case, with GPT, I could skip this step and just summarize what I thought I knew about the topic and ask GPT to compare what I was saying with its own understanding.

I also really like the ability of GPT to confirm whether a given technique in an engineering implementation is common. I think a lot of the problem of managing risk in an engineering project is to try to keep your integration points on the beaten path, i.e., it is best to structure how you connect to shared tools such that these connections are consistent with the way other users are using the tool; this way you are more likely to be in line with the design and less subject to surprises -- and you are less likely to be caught out with an awkward upgrade path as the software evolves.

Friday, September 9, 2022

Running ssh tunneled X clients after su

When you ssh to a machine and then run an X client, the authentication is based on having a valid entry in your ~/.Xauthority file, which ssh initializes automatically if you run it with -X or -Y. But if you su to another user, that authorization cannot be found, and the X client initialization will fail.

To manage this situation, I wrote a pair of scripts and aliases. X.auth_save saves the authentication away to /tmp:

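
#!/bin/bash
# Defines $x_auth_saved, the shared location of the saved cookie.
. X.auth_saved.inc
# The most recently added entry is the one for the current ssh session.
auth=$(xauth -f "$HOME/.Xauthority" list | tail -1)
echo $auth > "$x_auth_saved"
# World-readable and writable so that another user can read or overwrite it.
# Note that this exposes the X cookie to every local user.
chmod 666 "$x_auth_saved"
echo "Saved X authority $auth to $x_auth_saved"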

Restore as a new user with X.auth_restore:

#!/bin/bash
. X.auth_saved.inc
if [ -r $x_auth_saved ]; then
        echo "OK found \"$x_auth_saved\"" 1>&2
else
        echo "FAIL could not find \"$x_auth_saved\"" 1>&2
        exit 1
fi
auth=`cat $x_auth_saved`
echo "Restored X authority $auth from $x_auth_saved"
touch $HOME/.Xauthority
if ! xauth add $auth; then
        echo "FAIL: xauth add $auth failed, exiting..." 1>&2
        exit 1
else
        echo "OK xauth add $auth"
fi

Lastly, the path of the shared file is defined in one place, X.auth_saved.inc:

x_auth_saved=/tmp/X.auth_saved

Since the names are unwieldy, I aliased them to xas and xar:

alias xas=X.auth_save
alias xar=X.auth_restore

Monday, May 21, 2018

Memoization: an easy way to mock dependencies

Years ago I wrote about a simple method to get the benefits of memoization for any commandline application whose outputs are a function of its commandline arguments. I used this technique this year while implementing multivcs_query, a utility for examining software source which spans multiple (and possibly incompatible) source code repositories. The tests for multivcs_query call out to all the major source control repository types, including git, subversion, perforce and even a variant of ClearCase. Memoization is key for the tests to run fast because of the dispiriting slowness of some of the source control programs multivcs_query supports.*

It occurred to me a bit later that a broader purpose could be served by memoization in this instance. One of the least attractive aspects of multivcs_query is its testing dependency on the existence of such a variety of source control management systems and also particular code lines which will surely not exist on most people's servers. I had thought about having some sort of bootstrap code which would establish simple code lines with enough content and history to support the tests I need to run, but of course that would be a significant amount of work and would also keep the unfortunate assumption that the source control systems involved are even installed on the local system, a good bet with git, but probably not with any of the others. But if multivcs_query uses wrappers with memoization for its interactions with every source control system it uses, then all I need to do is seed the memoization cache with appropriate data and my tests will work even if none of the source control systems exist on a local server. For each request, the memoization layer will see a hit in the cache and immediately return valid results, never even attempting to run the (possibly non-existent) version control systems that are theoretically involved. And that means I can pursue development work for multivcs_query on a laptop which naturally has none of the server-side source control software installed.
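
The idea can be sketched in a few lines of Python. This is an illustration of the technique, not the actual cache.pl: the function names, the use of SHA-256 for the cache file name, and the tool name no_such_vcs are all assumptions made for the example.

```python
import hashlib
import os
import subprocess
import tempfile


def cache_path(argv, cache_dir):
    """Map a command line to (cache file path, human-readable key)."""
    key = " ".join(argv)
    digest = hashlib.sha256(key.encode()).hexdigest()[:16]
    return os.path.join(cache_dir, "cache." + digest), key


def cached_run(argv, cache_dir):
    """Return the stdout of argv, running the command only on a cache miss."""
    path, key = cache_path(argv, cache_dir)
    if os.path.isfile(path):
        # Cache hit: the underlying tool is never invoked, so it does not
        # even need to be installed.
        with open(path) as f:
            return f.read()
    out = subprocess.run(argv, capture_output=True, text=True,
                         check=True).stdout
    with open(path, "w") as f:
        f.write(out)
    # Companion file so the opaque hashed name can be traced to its command.
    with open(path + ".cmd", "w") as f:
        f.write(key)
    return out


# Seed the cache for a tool that is not installed (hypothetical name),
# the way a test fixture would, then "run" it.
cache_dir = tempfile.mkdtemp()
fake_cmd = ["no_such_vcs", "log", "-1"]
seed_path, _ = cache_path(fake_cmd, cache_dir)
with open(seed_path, "w") as f:
    f.write("r42 | seeded history entry\n")

print(cached_run(fake_cmd, cache_dir), end="")
```

Because the lookup happens before any attempt to execute the command, the seeded entry is returned even though no_such_vcs does not exist on the machine.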

To make this beautiful vision a reality, several pieces of work were required.

I had to change cache.pl so that it no longer generates the names of its cached result files by simply concatenating the inputs -- I was immediately exceeding the maximum filename length with calls referring to multiple source code files. So instead I follow the old method to generate a (sometimes very long) string key, and then just call cksum with that key to make a short ID.
Then, to avoid cache contents becoming unmanageably opaque, I also save a companion file with a .cmd suffix recording the actual command that was run.
Now that I want all test cases which are based on these version control system dependencies to use the cache, I am frequently looking at the current cache and extracting the files that need to be saved away for future successful test runs. Especially now that the cache files have names based on cksum values, it is no longer a simple matter to look at the cache directory and understand what is what. To make this situation transparent, I have implemented a simple utility, cache.ls, to list the contents of the cache and propose commands to copy the relevant files to a new location (presumably the folder containing the seed cache contents needed for successful test runs). Here is the code for cache.ls:
```
#!/bin/bash
search_args=$*

if [ -z "$search_args" ]; then
        search_args=.
fi

for cf in `ls $TMP/cache* | grep -v 'cmd$'`; do
        if cat $cf.cmd | grepm $search_args; then
                cat $cf
                echo EOD
                echo "cp -p $cf* ."
                echo '----------------------------------------------------------------------------------------------'
        fi
done
```

Finally, it is important to initialize the cache with the appropriate data when running on a host for the first time. In my test wrapper, I added the following code to perform this task:

if [ ! -f $TMP/CACHE_SEEDED_FOR_TESTS ]; then
        echo "Initializing cache data for test runs on this host:"
        echo "cp -p test/cache_seed/* $TMP..."
        if !  cp -p test/cache_seed/* $TMP; then
                echo "$0: cp -p test/cache_seed/* $TMP failed, exiting..." 1>&2
                exit 1
        fi
        if ! touch $TMP/CACHE_SEEDED_FOR_TESTS; then
                echo "$0: touch $TMP/CACHE_SEEDED_FOR_TESTS failed, exiting..." 1>&2
                exit 1
        fi
fi

So that's how it works. For completeness, following is the updated cache.pl code using cksum to generate the cache filenames:

use strict;
use IO::File;

my $__trace = 0;

sub get_cached_output_path
{
  my($extra_key, $s) = @_;

  my $key = $extra_key . $s;
  
  my $fn_base = `echo $key | cksum`;
  chomp $fn_base;
  $fn_base =~ s/ .*//;
  
  my $fn = "$ENV{'TMP'}/cache." . $fn_base;
  
  my $f = new IO::File("$fn.cmd", "w");
  $f->write($key);
  $f->close();

  return $fn;
}

my @argv = @ARGV;
my $extra_key = $ENV{"CACHE_EXTRA_ARG"};
$extra_key = "" if !defined $extra_key;

if ($argv[1] eq "-cache-clear")
{
  my $cached_output_stem = get_cached_output_path($extra_key, $argv[0]);
  die "empty output stem" unless $cached_output_stem;
  my $cmd = "rm -f $cached_output_stem* 2> /dev/null";
  print "$cmd\n" if $__trace;
  print `$cmd`;
  exit(0);
}


my $cmd = join('" "', @argv);


$cmd =~ s/(" ")*$//g;
$cmd = '"' . $cmd . '"';
$cmd =~ s/"([\w_#,\.\/]+)"/$1/g;

print "cmd=$cmd\n" if $__trace;

my $cached_output = get_cached_output_path($extra_key, $cmd);

if (-f $cached_output)
{
  print "using existing $cached_output\n" if $__trace;
}
else
{
  my $cmd_with_redirects = "$cmd > $cached_output 2> $cached_output.err";
  `$cmd_with_redirects`;
  if ($__trace)
  {
    print "Executed $cmd_with_redirects\n";
  }
  if ( `cat $cached_output.err` eq '' )
  {
    if ($__trace)
    {
      print "No error output, so deleting $cached_output.err\n";
    }
    unlink "$cached_output.err";
  }
}
print `cat $cached_output`;
if (-f "$cached_output.err" )
{
  print STDERR `cat $cached_output.err`;
  # assume trouble if there was output to stderr, and remove the cached output:
  unlink "$cached_output.err";
  unlink $cached_output;
}

* It really is strange -- svn in particular has about a 2 second overhead for me no matter how simple my call. I'm guessing this is some sort of pathological misconfiguration of the local subversion server, but I don't control it and it is tangential enough to the central purpose of multivcs_query that I can't justify launching a campaign to improve it. But thanks to memoization, I don't have to care too much.

Friday, March 18, 2016

simple grep replacement for sorted data

The idea here is that if you are looking at a file whose lines are sorted, and you have some idea what the lines you're interested in will start with, you should be able to pull out your target efficiently, even if the file is large.

For example, many log files start with a timestamp composed in a sortable order (i.e., year, month, day, hour, minute, second), where you may well have a fairly precise idea of what you want (e.g., from 12:02:05 on March 3, 2016 and the 10 seconds following). Normal grep will scan the entire file, which is a painful exercise if the file is many gigabytes. sgrep will in contrast seek to the interesting region and then search for a regular expression pattern only within that region.

So for the example given above, one would search as follows if looking at an artifactory request log:

        sgrep 20160303120205 20160303120215 some_regex_pattern /private/artifactory/logs/request.log

Here's the code to do it, changing some minute searches to sub-second searches:

class Sgrep
        attr_accessor :patt
        attr_accessor :beginning_of_significance
        attr_accessor :ending_of_significance
        attr_accessor :fn
        attr_accessor :f
        def initialize(beginning_of_significance, ending_of_significance, patt, fn)
                self.beginning_of_significance = beginning_of_significance
                self.ending_of_significance = ending_of_significance
                self.patt = Regexp.new(patt)
                self.fn = fn
                if !File.readable?(fn)
                        STDERR.puts "sgrep: #{fn}: No such file or directory"
                        exit(1)
                end
                self.f = File.open(fn, "r")
                puts "sgrep looking for \"#{patt}\" in file #{fn}, bounded by the significant region starting with \"#{beginning_of_significance}\" and ending with \"#{ending_of_significance}\"..." if Sgrep.trace
        end
        def search_sequentially_from(pos)
                if pos > 0
                        f.seek(pos-1, IO::SEEK_SET)
                        self.seek_next_line
                else
                        f.seek(pos, IO::SEEK_SET)
                end
                while !self.f.eof? do
                        next_line_start = self.f.tell
                        line = self.f.gets
                        if line.start_with?(self.beginning_of_significance) || line > self.beginning_of_significance
                                f.seek(next_line_start, IO::SEEK_SET)
                                puts "searched sequentially to #{next_line_start} (seeing #{line})" if Sgrep.trace
                                return
                        end
                end
        end
        def seek_beginning_of_significance()
                lower_bound = 0
                upper_bound = File.size(self.fn)
                puts "lower_bound=#{lower_bound}, upper_bound=#{upper_bound}" if Sgrep.trace
                while upper_bound > lower_bound do
                        midpoint = (lower_bound + ((upper_bound - lower_bound) / 2)).to_i
                        f.seek(midpoint, IO::SEEK_SET)
                        self.seek_next_line
                        next_line_start = self.f.tell
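                        line = self.f.gets
                        if line.nil?
                                # Ran off the end of the file: treat as "over" and look earlier.
                                upper_bound = midpoint-1
                                next
                        end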
                        puts "see #{line.chomp}, lower_bound=#{lower_bound}, upper_bound=#{upper_bound}, midpoint=#{midpoint}, next_line_start=#{next_line_start}" if Sgrep.trace
                        if line.start_with?(self.beginning_of_significance)
                                puts "match" if Sgrep.trace
                                if upper_bound > next_line_start
                                        upper_bound = next_line_start
                                else
                                        break
                                end
                        elsif line < self.beginning_of_significance
                                puts "under" if Sgrep.trace
                                lower_bound = self.f.tell+1
                        else
                                puts "over" if Sgrep.trace
                                upper_bound = midpoint-1
                        end
                end
                self.search_sequentially_from(lower_bound)
        end
        def seek_next_line()
                while !self.f.eof? do
                        if self.f.getc == "\n"
                                return
                        end
                end
        end
        def significant_lines
                while !self.f.eof? do
                        line = f.gets
                        if line.start_with?(self.ending_of_significance) || line < self.ending_of_significance
                                yield line
                        else
                                return
                        end
                end
                return
        end
        def search()
                #return `grep "#{self.patt}" #{self.fn}`
                self.seek_beginning_of_significance

                exit_code = 1
                self.significant_lines do | line |
                        if self.patt.match(line)
                                exit_code = 0
                                print line
                        end
                end
                return exit_code
        end
        class << self
                attr_accessor :trace
        end
end

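j = 0
# Consume leading option flags; the nil check avoids a crash when ARGV runs out.
while ARGV[j] && ARGV[j].start_with?("-") do
        case ARGV[j]
        when "-v"
                Sgrep.trace = true
        end
        j += 1
end
if ARGV.length - j < 4
        STDERR.puts "Usage: sgrep [-v] BEGIN_PREFIX END_PREFIX PATTERN FILE"
        exit(2)
end
beginning_of_significance = ARGV[j]
ending_of_significance = ARGV[j+1]
patt = ARGV[j+2]
fn = ARGV[j+3]
g = Sgrep.new(beginning_of_significance, ending_of_significance, patt, fn)
exit(g.search())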
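
For readers who want the core idea in compact form, here is a rough Python sketch of the same technique: binary search over byte offsets to find the first line at or after the starting prefix, then a sequential scan until the ending prefix is passed. It is an independent illustration, not a port of the Ruby above; the function names and the sample log lines are made up.

```python
import os
import re
import tempfile


def _seek_line_start(f, pos):
    """Position f at the first line start at or after byte offset pos."""
    if pos == 0:
        f.seek(0)
    else:
        f.seek(pos - 1)
        f.readline()  # consume through the next newline
    return f.tell()


def sorted_grep(path, begin, end, pattern):
    """Yield lines matching pattern between the begin and end prefixes."""
    regex = re.compile(pattern)
    begin_b, end_b = begin.encode(), end.encode()
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path)
        # Find the smallest offset whose following line is >= begin.
        while lo < hi:
            mid = (lo + hi) // 2
            _seek_line_start(f, mid)
            line = f.readline()
            if line and line < begin_b:
                lo = mid + 1
            else:
                hi = mid
        _seek_line_start(f, lo)
        for line in f:
            # Stop at the first line that sorts after the end prefix.
            if not (line.startswith(end_b) or line < end_b):
                break
            text = line.decode(errors="replace")
            if regex.search(text):
                yield text


# Made-up sample data in the timestamp-first format described above.
sample = [
    "20160303120200 GET /a\n",
    "20160303120205 GET /b\n",
    "20160303120207 PUT /c\n",
    "20160303120215 GET /d\n",
    "20160303120230 GET /e\n",
]
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.writelines(sample)
path = tmp.name

for hit in sorted_grep(path, "20160303120205", "20160303120215", "GET"):
    print(hit, end="")
```

The file is opened in binary mode so that seek and tell work in plain byte offsets; each probe reads at most two lines, so the number of reads grows with the logarithm of the file size.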

Sunday, June 7, 2015

Generating the simplest possible nginx load-balancing config

I recently had to put together a load-balancing configuration for nginx and wished that I had a shell script to generate the very simplest possible nginx setup. I think I have it now:

:
if [ -z "$1" ]; then
        echo "Usage: $0 URI HOST1 HOST2 ..."
        exit 1
fi
uri="$1"
shift
hosts=$@
out=/etc/nginx/nginx.conf
if [ ! -f $out.bak ]; then
        mv $out $out.bak
fi
cat <<EOF > $out
events {
}
http {
        upstream configuration_servers {
EOF

for h in $hosts; do
        echo "                server $h:8080;"
done >> $out

cat <<EOF >> $out
        }
        server {
                listen 80;
                location $uri {
                        proxy_pass http://configuration_servers$uri;
                }
        }
}
EOF

cat $out

service nginx restart

url=localhost:80$uri
if ! curl $url; then
        echo "$0: curl $url failed, exiting..." 1>&2
        exit 1
fi
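
If you would rather inspect the generated configuration before touching /etc/nginx, the same template can be rendered as a string. The following Python sketch mirrors the structure the script writes; the function name and the sample host names are made up, and it deliberately does not write the file or restart nginx.

```python
def render_nginx_conf(uri, hosts, backend_port=8080):
    """Return a minimal load-balancing nginx.conf as a string."""
    if not hosts:
        # An upstream block with no servers is not a valid configuration.
        raise ValueError("at least one backend host is required")
    servers = "\n".join(
        f"        server {h}:{backend_port};" for h in hosts
    )
    return (
        "events {\n"
        "}\n"
        "http {\n"
        "    upstream configuration_servers {\n"
        f"{servers}\n"
        "    }\n"
        "    server {\n"
        "        listen 80;\n"
        f"        location {uri} {{\n"
        f"            proxy_pass http://configuration_servers{uri};\n"
        "        }\n"
        "    }\n"
        "}\n"
    )


# Hypothetical URI and backend hosts, for illustration only.
conf = render_nginx_conf("/config", ["app1", "app2"])
print(conf, end="")
```

One difference from the shell version is the explicit check for an empty host list, which the script does not make.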

Monday, January 12, 2015

Sending strangers and anonymous callers to voicemail in Google Voice

I normally only write about software development in this blog, but I can't resist adding a software configuration recipe that I have found useful for cutting back on unwanted callers getting through to me via Google Voice and my Android phone. This problem recently became much worse when I acquired a new phone number which had previously been used by Melody, a Bay Area woman who apparently ran up lots of debts.

I can understand that you would need to have a won't-take-no-for-an-answer kind of personality to work as a debt collector, but wow, those people are not fun to talk to. Still, I didn't want to just block strangers and anonymous callers, since once in a while those calls are legitimate. Doctors' offices, for example, typically call anonymously in order to protect the privacy of their patients. So what I really wanted to do was to send all those folks directly to voicemail.

Sounds simple enough, and I was readily assured that this was possible, but I never did catch up with an explicit recipe to do it.

So here is one:

Browse to Google Voice settings
On the Phones tab, disable all of your devices (i.e., uncheck the checkbox associated with each)
On the Voicemail & Text tab, click the edit button for the special Callers group "All Contacts"
Enable all devices that you want to ring when one of your contacts calls you

That's it!