Software utensils
Wednesday, April 29, 2026
Thinking model expense is hard to predict
Monday, December 26, 2022
GPT as a learning tool
Normally when I learn some new concept online, I spend a lot of extra time on supplementary reading, looking to confirm or disprove my mental model of that concept. But in this case, with GPT, I could skip this step: I just summarized what I thought I knew about the topic and asked GPT to compare what I was saying with its own understanding.
I also really like GPT's ability to confirm whether a given technique in an engineering implementation is common. I think much of managing risk in an engineering project comes down to keeping your integration points on the beaten path, i.e., it is best to structure how you connect to shared tools so that these connections are consistent with the way other users use the tool. This way you are more likely to be in line with the design and less subject to surprises -- and you are less likely to be caught out with an awkward upgrade path as the software evolves.
Friday, September 9, 2022
Running ssh tunneled X clients after su
To manage this situation, I wrote a pair of scripts and aliases. X.auth_save saves the authentication away to /tmp:
#!/bin/bash
. X.auth_saved.inc
auth=`xauth -f $HOME/.Xauthority list | tail -1`
echo $auth > $x_auth_saved
chmod 777 $x_auth_saved
echo "Saved X authority $auth to $x_auth_saved"
Restore as a new user with X.auth_restore:
#!/bin/bash
. X.auth_saved.inc
if [ -r $x_auth_saved ]; then
  echo "OK found \"$x_auth_saved\"" 1>&2
else
  echo "FAIL could not find \"$x_auth_saved\"" 1>&2
  exit 1
fi
auth=`cat $x_auth_saved`
echo "Restored X authority $auth from $x_auth_saved"
touch $HOME/.Xauthority
if ! xauth add $auth; then
  echo "FAIL: xauth add $auth failed, exiting..." 1>&2
  exit 1
else
  echo "OK xauth add $auth"
fi
Lastly track the shared file variable name in X.auth_saved.inc:
x_auth_saved=/tmp/X.auth_saved
Since the names are unwieldy, I aliased them to xas and xar:
alias xas=X.auth_save
alias xar=X.auth_restore
Monday, May 21, 2018
Memoization: an easy way to mock dependencies
It occurred to me a bit later that a broader purpose could be served by memoization in this instance. One of the least attractive aspects of multivcs_query is that its tests depend on the existence of a variety of source control management systems, and also on particular code lines which will surely not exist on most people's servers. I had thought about having some sort of bootstrap code which would establish simple code lines with enough content and history to support the tests I need to run, but that would be a significant amount of work and would still keep the unfortunate assumption that the source control systems involved are even installed on the local system: a good bet with git, but probably not with any of the others. But if multivcs_query uses wrappers with memoization for its interactions with every source control system it uses, then all I need to do is seed the memoization cache with appropriate data and my tests will work even if none of the source control systems exist on a local server. For each request, the memoization layer will see a hit in the cache and immediately return valid results, never even attempting to run the (possibly non-existent) version control systems that are theoretically involved. And that means I can pursue development work for multivcs_query on a laptop which naturally has none of the server-side source control software installed.
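A minimal sketch of the idea in Ruby, not the actual multivcs_query code: the class name MemoizedRunner, its seed method, and the fakevcs command are all hypothetical, and the cache is an in-memory hash rather than files. Once an entry is seeded, a lookup never reaches the code that would launch the real tool.

```ruby
# Hypothetical sketch: memoize command output so that a seeded cache
# can stand in for tools that are not installed on this machine.
class MemoizedRunner
  attr_reader :executions

  # The block is the "real" way to run a command; it is only called on a miss.
  def initialize(&runner)
    @runner = runner
    @cache = {}
    @executions = 0
  end

  # Pre-load a known result, e.g. one captured on a server that has the tool.
  def seed(cmd, output)
    @cache[cmd] = output
  end

  # Return the cached output if present; otherwise run, store, and return it.
  def run(cmd)
    return @cache[cmd] if @cache.key?(cmd)
    @executions += 1
    @cache[cmd] = @runner.call(cmd)
  end
end

# In a test, the runner raises, which shows that the seeded entry is served
# without ever attempting to launch the (absent) version control tool.
runner = MemoizedRunner.new { |cmd| raise "would have executed: #{cmd}" }
runner.seed("fakevcs log -r 42", "r42 | fixed the build\n")
puts runner.run("fakevcs log -r 42")
```

The raising block plays the role of the missing version control system: if the cache were ever bypassed, the test would fail loudly instead of silently depending on the local installation.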
To make this beautiful vision a reality, several pieces of work were required.
- I had to change cache.pl not to generate the names of its cached result files by simply concatenating the inputs -- I was immediately exceeding the maximum filename length with calls referring to multiple source code files. So instead I follow the old method to generate a (sometimes very long) string key, and then just run cksum on that key to make a short ID. (cksum is only a 32-bit CRC, so the ID is not guaranteed to be unique, but collisions are unlikely in a small cache.)
- Then, to avoid cache contents becoming unmanageably opaque, I also save a companion file with a .cmd suffix recording the actual command that was run.
- Now that I want all test cases which are based on these version control system dependencies to use the cache, I am frequently looking at the current cache and extracting the files that need to be saved away for future successful test runs. Especially now that the cache files have names based on cksum values, it is no longer a simple matter to look at the cache directory and understand what is what. To make this situation transparent, I have implemented a simple utility cache.ls to list the contents of the cache and propose commands to copy the relevant files to a new location (presumably the folder containing the cache contents needed for successful test runs). Here is the code for cache.ls:
#!/bin/bash
search_args=$*
if [ -z "$search_args" ]; then
  search_args=.
fi
for cf in `ls $TMP/cache* | grep -v 'cmd$'`; do
  if cat $cf.cmd | grepm $search_args; then
    cat $cf
    echo EOD
    echo "cp -p $cf* ."
    echo '----------------------------------------------------------------------------------------------'
  fi
done
- Finally, it is important to initialize the cache with the appropriate data when running on a host for the first time. In my test wrapper, I added the following code to accomplish this task:
if [ ! -f $TMP/CACHE_SEEDED_FOR_TESTS ]; then
  echo "Initializing cache data for test runs on this host:"
  echo "cp -p test/cache_seed/* $TMP..."
  if ! cp -p test/cache_seed/* $TMP; then
    echo "$0: cp -p test/cache_seed/* $TMP failed, exiting..." 1>&2
    exit 1
  fi
  if ! touch $TMP/CACHE_SEEDED_FOR_TESTS; then
    echo "$0: touch $TMP/CACHE_SEEDED_FOR_TESTS failed, exiting..." 1>&2
    exit 1
  fi
fi
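The file naming scheme from the list above (a long key reduced to a short checksum, plus a .cmd companion file that keeps the cache readable) can be sketched in Ruby roughly as follows. This is an illustration, not a port of cache.pl: the helper names cached_output_path and list_cache and the fakevcs command are hypothetical, and Zlib.crc32 is swapped in for the cksum utility. Both are 32-bit CRCs, but they do not produce the same numbers, so these IDs would not match an existing cache.

```ruby
require "zlib"
require "tmpdir"

# Hypothetical helper: map an arbitrarily long key to a short cache path
# and record the key itself in a companion ".cmd" file.
# Assumption: Zlib.crc32 stands in for the cksum utility here.
def cached_output_path(cache_dir, key)
  path = File.join(cache_dir, "cache.#{Zlib.crc32(key)}")
  File.write("#{path}.cmd", key)
  path
end

# Return "path: command" for every cache entry, in the spirit of cache.ls.
def list_cache(cache_dir)
  Dir.glob(File.join(cache_dir, "cache.*.cmd")).sort.map do |cmd_file|
    "#{cmd_file.sub(/\.cmd\z/, '')}: #{File.read(cmd_file)}"
  end
end

Dir.mktmpdir do |dir|
  # A key far longer than any filesystem would accept as a file name.
  long_key = "fakevcs diff " + (1..200).map { |i| "src/module_#{i}/file.c" }.join(" ")
  path = cached_output_path(dir, long_key)
  puts File.basename(path).length   # stays short however long the key is
  puts list_cache(dir).first[0, 60]
end
```

Because the checksum is only 32 bits, the companion file also offers a way to detect a collision: a stricter version could compare the stored key with the requested one before trusting a hit.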
use strict;
use IO::File;
my $__trace = 0;

sub get_cached_output_path
{
    my($extra_key, $s) = @_;
    my $key = $extra_key . $s;
    my $fn_base = `echo $key | cksum`;
    chomp $fn_base;
    $fn_base =~ s/ .*//;
    my $fn = "$ENV{'TMP'}/cache." . $fn_base;
    my $f = new IO::File("$fn.cmd", "w");
    $f->write($key);
    $f->close();
    return $fn;
}

my @argv = @ARGV;
my $extra_key = $ENV{"CACHE_EXTRA_ARG"};
$extra_key = "" if !defined $extra_key;
if ($argv[1] eq "-cache-clear")
{
    my $cached_output_stem = get_cached_output_path($extra_key, $argv[0]);
    die "empty output stem" unless $cached_output_stem;
    my $cmd = "rm -f $cached_output_stem* 2> /dev/null";
    print "$cmd\n" if $__trace;
    print `$cmd`;
    exit(0);
}
my $cmd = join('" "', @argv);
$cmd =~ s/(" ")*$//g;
$cmd = '"' . $cmd . '"';
$cmd =~ s/"([\w_#,\.\/]+)"/$1/g;
print "cmd=$cmd\n" if $__trace;
my $cached_output = get_cached_output_path($extra_key, $cmd);
if (-f $cached_output)
{
    print "using existing $cached_output\n" if $__trace;
}
else
{
    my $cmd_with_redirects = "$cmd > $cached_output 2> $cached_output.err";
    `$cmd_with_redirects`;
    if ($__trace)
    {
        print "Executed $cmd_with_redirects\n";
    }
    if ( `cat $cached_output.err` eq '' )
    {
        if ($__trace)
        {
            print "No error output, so deleting $cached_output.err\n";
        }
        unlink "$cached_output.err";
    }
}
print `cat $cached_output`;
if (-f "$cached_output.err" )
{
    print STDERR `cat $cached_output.err`;
    # assume trouble if there was output to stderr, and remove the cached output:
    unlink "$cached_output.err";
    unlink $cached_output;
}
* It really is strange -- svn in particular has about a 2 second overhead for me no matter how simple my call. I'm guessing this is some sort of pathological misconfiguration of the local subversion server, but I don't control it and it is tangential enough to the central purpose of multivcs_query that I can't justify launching a campaign to improve it. But thanks to memoization, I don't have to care too much.
Friday, March 18, 2016
simple grep replacement for sorted data
For example, many log files start each line with a timestamp composed in a sortable order (i.e., year, month, day, hour, minute, second), and you may well have a fairly precise idea of the time range you want (e.g., from 12:02:05 on March 3, 2016 through the 10 seconds following). Normal grep will scan the entire file, which is a painful exercise if the file is many gigabytes. sgrep will in contrast seek to the interesting region and then search for a regular expression pattern only within that region.
So for the example given above, one would search as follows if looking at an artifactory request log:
sgrep 20160303120205 20160303120215 some_regex_pattern /private/artifactory/logs/request.log
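The two boundary arguments are just the start and end of the time range written in the log's own timestamp layout. A small sketch of deriving them, assuming the YYYYMMDDHHMMSS layout from the example (the helper name sgrep_bounds is hypothetical and not part of sgrep):

```ruby
# Hypothetical helper: build the two boundary arguments from a start time
# and a duration in seconds, assuming YYYYMMDDHHMMSS timestamps.
def sgrep_bounds(start_time, seconds)
  fmt = "%Y%m%d%H%M%S"
  [start_time.strftime(fmt), (start_time + seconds).strftime(fmt)]
end

first, last = sgrep_bounds(Time.utc(2016, 3, 3, 12, 2, 5), 10)
puts "sgrep #{first} #{last} some_regex_pattern /private/artifactory/logs/request.log"
# prints: sgrep 20160303120205 20160303120215 some_regex_pattern /private/artifactory/logs/request.log
```

This only works because the fields are fixed-width and ordered from most to least significant, so comparing the strings character by character gives the same order as comparing the times.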
Here's the code to do it, changing some minute searches to sub-second searches:
class Sgrep
  attr_accessor :patt
  attr_accessor :beginning_of_significance
  attr_accessor :ending_of_significance
  attr_accessor :fn
  attr_accessor :f

  def initialize(beginning_of_significance, ending_of_significance, patt, fn)
    self.beginning_of_significance = beginning_of_significance
    self.ending_of_significance = ending_of_significance
    self.patt = Regexp.new(patt)
    self.fn = fn
    if !File.readable?(fn)
      STDERR.puts "sgrep: #{fn}: No such file or directory"
      exit(1)
    end
    self.f = File.open(fn, "r")
    puts "sgrep looking for \"#{patt}\" in file #{fn}, bounded by the significant region starting with \"#{beginning_of_significance}\" and ending with \"#{ending_of_significance}\"..." if Sgrep.trace
  end

  # Scan forward line by line from byte offset pos and leave the file
  # positioned at the first line at or after beginning_of_significance.
  def search_sequentially_from(pos)
    if pos > 0
      # pos is the start of a line, so pos-1 is the previous line's newline
      f.seek(pos-1, IO::SEEK_SET)
      self.seek_next_line
    else
      f.seek(pos, IO::SEEK_SET)
    end
    while !self.f.eof? do
      next_line_start = self.f.tell
      line = self.f.gets
      if line.start_with?(self.beginning_of_significance) || line > self.beginning_of_significance
        f.seek(next_line_start, IO::SEEK_SET)
        puts "searched sequentially to #{next_line_start} (seeing #{line})" if Sgrep.trace
        return
      end
    end
  end

  # Binary search by byte offset to get close to the start of the region,
  # then finish with a short sequential scan.
  def seek_beginning_of_significance()
    lower_bound = 0
    upper_bound = File.size(self.fn)
    puts "lower_bound=#{lower_bound}, upper_bound=#{upper_bound}" if Sgrep.trace
    while upper_bound > lower_bound do
      midpoint = (lower_bound + ((upper_bound - lower_bound) / 2)).to_i
      f.seek(midpoint, IO::SEEK_SET)
      self.seek_next_line
      next_line_start = self.f.tell
      line = self.f.gets
      if line.nil?
        # midpoint fell inside the last line, so no line starts after it
        puts "eof" if Sgrep.trace
        upper_bound = midpoint-1
        next
      end
      puts "see #{line.chomp}, lower_bound=#{lower_bound}, upper_bound=#{upper_bound}, midpoint=#{midpoint}, next_line_start=#{next_line_start}" if Sgrep.trace
      if line.start_with?(self.beginning_of_significance)
        puts "match" if Sgrep.trace
        if upper_bound > next_line_start
          upper_bound = next_line_start
        else
          break
        end
      elsif line < self.beginning_of_significance
        puts "under" if Sgrep.trace
        # Resume at the start of the following line. (Using tell+1 here
        # made the sequential scan skip that line without testing it.)
        lower_bound = self.f.tell
      else
        puts "over" if Sgrep.trace
        upper_bound = midpoint-1
      end
    end
    self.search_sequentially_from(lower_bound)
  end

  # Consume characters up to and including the next newline.
  def seek_next_line()
    while !self.f.eof? do
      if self.f.getc == "\n"
        return
      end
    end
  end

  # Yield each line from the current position until one sorts after
  # ending_of_significance.
  def significant_lines
    while !self.f.eof? do
      line = f.gets
      if line.start_with?(self.ending_of_significance) || line < self.ending_of_significance
        yield line
      else
        return
      end
    end
    return
  end

  # Print matching lines in the region; return 0 on a match, 1 otherwise.
  def search()
    self.seek_beginning_of_significance
    exit_code = 1
    self.significant_lines do | line |
      if self.patt.match(line)
        exit_code = 0
        print line
      end
    end
    return exit_code
  end

  class << self
    attr_accessor :trace
  end
end

j = 0
# Guard against running off the end of ARGV when no arguments are given
while ARGV[j] && ARGV[j].start_with?("-") do
  case ARGV[j]
  when "-v"
    Sgrep.trace = true
  end
  j += 1
end
if ARGV.length < j + 4
  STDERR.puts "Usage: sgrep [-v] BEGIN END PATTERN FILE"
  exit(2)
end
beginning_of_significance = ARGV[j]
ending_of_significance = ARGV[j+1]
patt = ARGV[j+2]
fn = ARGV[j+3]
g = Sgrep.new(beginning_of_significance, ending_of_significance, patt, fn)
exit(g.search())
Sunday, June 7, 2015
Generating the simplest possible nginx load-balancing config
:
if [ -z "$1" ]; then
  echo "Usage: $0 URI HOST1 HOST2 ..."
  exit 1
fi
uri="$1"
shift
hosts=$@
out=/etc/nginx/nginx.conf
if [ ! -f $out.bak ]; then
  mv $out $out.bak
fi
cat <<EOF > $out
events {
}
http {
  upstream configuration_servers {
EOF
for h in $hosts; do
  echo "    server $h:8080;"
done >> $out
cat <<EOF >> $out
  }
  server {
    listen 80;
    location $uri {
      proxy_pass http://configuration_servers$uri;
    }
  }
}
EOF
cat $out
service nginx restart
url=localhost:80$uri
if ! curl $url; then
  echo "$0: curl $url failed, exiting..." 1>&2
  exit 1
fi
Monday, January 12, 2015
Sending strangers and anonymous callers to voicemail in Google Voice
I can understand that you would need to have a won't-take-no-for-an-answer kind of personality to work as a debt collector, but wow, those people are not fun to talk to. Still, I didn't want to just block strangers and anonymous callers, since once in a while those calls are legitimate. Doctors' offices, for example, typically call anonymously in order to protect the privacy of their patients. So what I really wanted to do was to send all those folks directly to voicemail.
Sounds simple enough, and I was readily assured that this was possible, but I never did catch up with an explicit recipe to do it.
So here is one:
- Browse to Google Voice settings
- On the Phones tab, disable all of your devices (i.e., uncheck the checkbox associated with each)
- On the Voicemail & Text tab, click the edit button for the special Callers group "All Contacts"
- Enable all devices that you want to ring when one of your contacts calls you