We have goldfish, normally kept in an outdoor pond. It’s not a deep enough pond that it would be safe to leave them out for a very harsh winter. So we keep as many as we can catch in a couple 150-gallon tanks in the basement.

Recently, and irritatingly close to when we’d set them outside, the nitrate level in the tanks grew too high. Fish excrete ammonia. Microorganisms then turn the ammonia into nitrites and then nitrates. In the wild, the nitrates then get used by … I dunno, plants? Which don’t thrive enough in our basement to clean them out. To get the nitrate out of the water all there is to do is replace the water.

We have six buckets, each holding five gallons, of water that we can use for replacement. So there’s up to 30 gallons of water that we could change out in a day. Can’t change more because tap water contains chloramines, which kill bacteria (good news for humans) but hurt fish (bad news for goldfish). We can treat the tap water to neutralize the chloramines, but want to give that time to finish. I have never found a good reference for how long this takes. I’ve adopted “about a day” because we don’t have a water tap in the basement and I don’t want to haul more than 30 gallons of water downstairs any given day.

So I got thinking, what’s the fastest way to get the nitrate level down for both tanks? Change 15 gallons in each of them once a day, or change 30 gallons in one tank one day and the other tank the next?

And, happy to say, I realized this was the tea-making problem I’d done a couple months ago. The tea-making problem had a different goal, that of keeping as much milk in the tea as possible. But the thing being studied was how partial replacements of a solution with one component affect the amount of the other component. The major difference is that the fish produce (ultimately) more nitrates in time. There’s no tea that spontaneously produces milk. But if nitrate-generation is low enough, the same conclusions follow. So, a couple days of 30-gallon changes, in alternating tanks, and we had the nitrates back to a decent level.
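For anyone who wants to check the two schedules, here is a minimal sketch. It assumes each tank holds a full 150 gallons, that the water mixes perfectly, and that the fish add no new nitrate during the comparison; the function name is made up for the illustration.

```python
def remaining_fraction(tank_gallons, change_gallons, n_changes):
    """Fraction of the original nitrate left after n_changes partial
    water changes, assuming perfect mixing and no new nitrate."""
    keep = 1.0 - change_gallons / tank_gallons
    return keep ** n_changes

TANK = 150.0  # gallons per tank (assumed full)

# Over any two-day stretch, each tank gets either two 15-gallon
# changes or a single 30-gallon change.
two_small = remaining_fraction(TANK, 15.0, 2)
one_large = remaining_fraction(TANK, 30.0, 1)

print(f"two 15-gallon changes leave {two_small:.1%} of the nitrate")
print(f"one 30-gallon change leaves {one_large:.1%} of the nitrate")
```

Under those assumptions the single large change comes out slightly ahead, since 0.8 is a little less than 0.9 times 0.9. The difference is small, and it says nothing about how quickly the fish replace what was removed.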

We’d have put the fish outside this past week if I hadn’t broken, again, the tool used for cleaning the outside pond.

A couple of weeks after that — on Thanksgiving, it happens — we caught one more fish. This brought the total to 54. And I either failed to make note of it or I can’t find the note I made of it. Such happens.

In getting the pond ready for the spring, and the return of our goldfish to the outdoors, we found another one! It was just this orange thing dug into the muck of the pool, and we thought initially it was something that had fallen in and gotten lost. A heron scarer, was my love’s first guess. The pond thermometer that sank without trace some years back was mine. I used the grabber to poke at it and woke up a pretty sulky goldfish. It went over to some algae where we couldn’t so easily bother it.

So that brings our fish count to 55, for those keeping track. Fortunately, it was a very gentle winter in our parts. We’re hoping to bring the goldfish back out to the pond in the next week or two. Our best estimate for the carrying capacity of the pond is 65 to 130 goldfish, so, we will see whether the goldfish do anything about this slight underpopulation.

Folks who’ve been around a while may remember the matter of our fish. I’d spent some time in the spring describing ways to estimate a population using techniques other than just counting everybody. And then revealed that the population of goldfish in our pond was something like 53, based on counting the fifty which we’d had wintering over in our basement and the three we counted in the pond despite the winter ice. This is known as determining the population “by inspection”.

I’m disappointed to say that, as best we can work out, they didn’t get around to producing any new goldfish this year. We didn’t see any evidence of babies, and haven’t seen any noticeably small ones swimming around. It’s possible we set them out too late in the spring. It’s possible too that the summer was never quite warm enough for them to feel like it was fish-production time.

This does mean that we have a reasonably firm upper limit on the number of fish we need to take in. 53 appears to be it. The winter’s been settling in, though, and we’ve started taking them in. This past day we took in twelve. That’s not bad for the first harvest and if we’re lucky we should have the pond emptied in a week or so. I’ll let folks know if there turns out to be a surprise in goldfish cardinality.

So, that was a fairly successful month. For June this blog managed a record 1,051 pages viewed. That’s just above April’s high of 1,047, and is a nice rebound from May’s 936. I feel comfortable crediting this mostly to the number of articles I published in the month. Between the Mathematics A To Z and the rush of Reading The Comics posts, and a couple of reblogged or miscellaneous bits, June was my most prolific month: I had 28 articles. If I’d known how busy it was going to be I wouldn’t have skipped the first two Sundays. And I start the month at 25,871 total views.

It’s quite gratifying to get back above 1,000 for more than the obvious reasons. I’ve heard rumors — and I’m not sure where because most of my notes are on my not-yet-returned main computer — that WordPress somehow changed its statistics reporting so that mobile devices aren’t counted. That would explain a sudden drop in both my mathematics and humor blogs, and drops I heard reported from other readership-watching friends. It also implies many more readers out there, which is a happy thought.

Unfortunately because of my computer problems I can’t give reports on things like the number of visitors, or the views per visitor. I can get at WordPress’s old Dashboard statistics page, and that had been showing the number of unique visitors and views per visitor and all that. But on Firefox 3.6.16, and on Safari 5.0.6, this information isn’t displayed. I don’t know if they’ve removed it altogether from the Dashboard Statistics page in the hopes of driving people to their new, awful, statistics page or what. I also can’t find things like the number of likes, because that’s on the New Statistics page, which is inaccessible on browsers this old.

Worse, I can’t find the roster of countries that sent me viewers. I trust that when I get my main computer back, and can look at the horrible new statistics page, I’ll be able to fill that in, but for now — nothing. I’m sorry. I will provide these popular lists when I’m able.

I can say what the most popular posts were in June. As you might expect for a month dominated by the A-To-Z project, the five most popular posts were all Reading The Comics entries:

A couple months ago I wrote about the problem of counting the number of goldfish in the backyard pond. For those who’d missed it:

How To Count Fish, which presented a way to estimate a population by simply doing two samplings of the population.

How To Re-Count Fish, which described some of the numerical problems in estimation-based population samples.

How Not To Count Fish, which threatened to collapse the entire project under fiddly practical problems.

Spring finally arrived, and about a month ago we finally stopped having nights that touched freezing. So we moved the goldfish which had been wintering over in the basement out to the backyard. This also let us count just how many goldfish we’d caught, and I thought folks might like to know what the population did look like.

The counting didn’t require probabilistic methods this time. Instead we took the fish from the traps and set up a correspondence between them and an ordered subset of positive whole numbers. This is the way you describe “just counting” so that it sounds either ferociously difficult or like a game. Whether it’s difficult or a game depends on whether you were a parent or a student back when the New Math was a thing. My love and I were students.

Altogether then there were fifty goldfish that had wintered over in the stock tank in the basement: eight adults and 42 baby fish. (Possibly nine and 41; one of the darker goldfish is small for an adult, but large for a baby.) Over the spring I identified at least three baby fish that had wintered over outdoors successfully. It was a less harsh winter than the one before. So there are now at least 53 goldfish in the pond. There are surely more on the way, but we haven’t seen any new babies yet.

Also this spring we finally actually measured the pond. We’d previously estimated it to be about ten feet in diameter and two feet deep, implying a carrying capacity of about 60 goldfish if some other assumptions are made. Now we’ve learned it’s nearer twelve feet in diameter and twenty inches deep. Call that two meters radius and half a meter height. That’s a volume of about 6.3 cubic meters, or 6300 liters, or enough volume of water for about 80 goldfish. We’ll see what next fall brings.

Of course I’m going to claim February 2015 was a successful month for my mathematics blog here. When have I ever claimed it was a dismal month? Probably I have, though last month wasn’t a case of it.

Anyway, according to WordPress’s statistics page, both the old and the new (which they’re getting around to making less awful), in February the mathematics blog had 859 views, down from January’s 944, but up from December’s 831. This is my second-highest on record. That said, I do want to point out that with a mere 28 days February was at a relative disadvantage for page clicks, and that January saw an average of 30.45 views per day, while February came in at 30.68, which is a record high.

There were 407 visitors in February, down from January’s 438 and December’s 424. 407 is the fourth-highest visitor count I have on record, though its 14.54 visitors per day falls short of January 2015’s 15.64, and way short of the all-time record, January 2013’s 15.26 visitors per day.

The views per visitor were at 1.96 in December, 2.16 in January, and dropped surely insignificantly to 2.11 for February, and there’s no plausibly splitting that up per day. Anyway, the mathematics blog started March at 21,815 views so there’s every reason to hope it’ll hit that wonderfully uniform count of 22,222 views soon.

The new statistics page lets me see that I drew 179 “likes” in February, down from 196 in January, but well up from December’s 128. Not to get too bean-counting but that is 6.39 likes per day in February against a mere 6.32 per day in January.

The most popular posts in February were mostly the comic strip posts, with the perennial favorite of trapezoids sneaking in. Getting more than thirty views each in February were:

How To Count Fish, which was somehow read three fewer times than the Re-Count one was.

Denominated Mischief, in which a bit of arithmetic manipulation proves that 7 equals 11.

In the listing of nations: as ever the countries sending me the most readers were the United States, with a timely 555; Canada with 83, and the United Kingdom with 66. The United States is down from January, but Canada and the United Kingdom strikingly higher. Germany sent 27 (up from 22), Austria 23 (down from 32), and Slovenia came from out of nowhere to send 21 readers this time around. India dropped from 18 to 6.

There were sixteen single-reader countries in February, up from January’s 14: Chile, Czech Republic, Hungary, Iceland, Ireland, Jamaica, Japan, Mexico, New Zealand, Philippines, Poland, Romania, Swaziland, Sweden, Venezuela, and Vietnam. The repeats from January are Hungary, Japan, and Mexico; Mexico is on a three-month streak.

There weren’t any really good, strange, amusing search terms bringing people here this past month, sad to say. The most evocative of them were:

topic about national mathematics day (I think this must be a reference to India’s holiday)

price is right piggy bank game (I’ve never studied this one, but I have done bits on the Item Up For Bid and on the Money Game)

jokes about algebraic geometry (are there any?)

groove spacing 78 and 45 (Yeah, I couldn’t find a definitive answer, but something like 170 grooves per inch seems plausible. Nobody’s taken me up on my Muzak challenge.)

two trapezoids make a (well, at least someone’s composing modernist, iconoclastic poetry around here)

sketch on how to inscribe more than one in a cycle in a triangle according to g.m green (I think this guy should meet the algebraic geometry jokester)

Last week I chatted a bit about a probabilistic, sampling-based method to estimate the population of fish in our backyard pond. The method estimates the population of a thing, in this case the fish, by capturing a sample of size $M$ and dividing that by the probability $p$ of catching one of the things in your sampling. Since we might not know the chance of catching the thing beforehand, we estimate it: catch some number $n$ of the fish or whatever, then put them back, and then re-catch as many. Some number $m$ of those will be re-caught, so we can estimate the chance of catching one fish as $p = \frac{m}{n}$. So the original population will be somewhere about $N = M \div p = M \cdot \frac{n}{m}$.

I want to talk a little bit about why that won’t work.

There is of course the obvious reason to think this will go wrong; it amounts to exactly the same reason why a baseball player with a .250 batting average (meaning the player can expect to get a hit in one out of every four at-bats) might go an entire game without getting on base, or might get on base three times in four at-bats. If something has $N$ chances to happen, and it has a probability $p$ of happening at every chance, it’s most likely that it will happen $N \cdot p$ times, but it can happen more or fewer times than that. Indeed, we’d get a little suspicious if it happened exactly $N \cdot p$ times. If we flipped a fair coin twenty times, it’s most likely to come up tails ten times, but there’s nothing odd about it coming up tails only eight or as many as fourteen times, and it’d stand out if it always came up tails exactly ten times.
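The coin-flip claim can be checked directly with the binomial distribution. This is a small sketch using only Python’s standard library; the function name is my own choice for the illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent tries,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Twenty flips of a fair coin: how likely are 8, 10, or 14 tails?
for tails in (8, 10, 14):
    print(tails, round(binomial_pmf(tails, 20, 0.5), 4))
```

Ten tails is the single most likely count, but it still turns up in fewer than one of every five batches of twenty flips. The same function works for any number of chances and any probability, such as fifty fish each with a 0.125 chance of being caught.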

To apply this to the fish problem: suppose that there are $N = 50$ fish in the pond; that 50 is the number we want to get. And suppose we know for a fact that every fish has a 12.5 percent chance, $p = 0.125$, of being caught in our trap. Ignore for right now how we know that probability; just pretend we can count on that being exactly true. The expectation value, the most probable number of fish to catch in any attempt, is $N \cdot p = 50 \cdot 0.125 = 6.25$ fish, which presents our first obvious problem. Well, maybe a fish might be wriggling around the edge of the net and fall out as we pull the trap out. (This actually happened as I was pulling some of the baby fish in for the winter.)

With these numbers it’s most probable to catch six fish, slightly less probable to catch five fish, less probable yet to catch seven, then four, then eight, and so on. But these are all tolerably plausible numbers. I used a mathematics package (Octave, an open-source clone of Matlab) to run ten simulated catches, from fifty fish each with a probability of .125 of being caught, and came out with these sizes for the fish harvests:

M = 4, 6, 3, 6, 7, 7, 5, 7, 8, 9

Since we know, by some method, that the chance of catching any one fish is exactly 0.125, this implies fish populations of:

N = 32, 48, 24, 48, 56, 56, 40, 56, 64, 72

Now, none of these is the right number, although 48 is respectably close and 56 isn’t too bad. But the range is hilarious: there might be as few as 24 or as many as 72 fish, based on just this evidence. That might as well be guessing.

This is essentially a matter of error analysis. Any one attempt at catching fish may be faulty, because the fish are too shy of the trap, or too eager to leap into it, or are just being difficult for some reason. But we can correct for the flaws of one attempt at fish-counting by repeating the experiment. We can’t always be unlucky in the same ways.

This is conceptually easy, and extremely easy to do on the computer; it’s a little harder in real life but certainly within the bounds of our research budget, since I just have to go out back and put the trap out. And redoing the experiment even pays off, too: average those population samples from the ten simulated runs there and we get a mean estimated fish population of 49.6, which is basically dead on.

(That was lucky, I must admit. Ten attempts isn’t really enough to make the variation comfortably small. Another run with ten simulated catchings produced a mean estimated population of 56; the next one … well, 49.6 again, but the one after that gave me 64. It isn’t until we get into a couple dozen attempts that the mean population estimate gets reliably close to fifty. Still, the work is essentially the same as the problem of “I flipped a fair coin some number of times; it came up tails ten times. How many times did I flip it?” It might have been any number ten or above, but I most probably flipped it about twenty times, and twenty would be your best guess absent more information.)
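The simulated catches were done in Octave, and I haven’t reproduced that session here. What follows is a rough Python sketch of the same experiment, assuming each of the fifty fish is caught independently with probability 0.125. The function names and the seed are illustrative, and the numbers it prints will not match the ones above.

```python
import random

def simulate_catch(population, p_catch, rng):
    """Count how many of `population` fish land in the trap,
    each independently with probability p_catch."""
    return sum(1 for _ in range(population) if rng.random() < p_catch)

def simulate_estimates(population=50, p_catch=0.125, runs=10, seed=None):
    """Run several simulated catches and turn each catch size
    into a population estimate by dividing by p_catch."""
    rng = random.Random(seed)
    catches = [simulate_catch(population, p_catch, rng) for _ in range(runs)]
    estimates = [c / p_catch for c in catches]
    return catches, estimates

catches, estimates = simulate_estimates(seed=2015)
print("catches:      ", catches)
print("estimates:    ", estimates)
print("mean estimate:", sum(estimates) / len(estimates))
```

Changing the seed, or raising `runs` from ten to a few dozen, gives a feel for how much the mean estimate wanders from one batch to the next.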

The same problem affects working out what the probability of catching a fish is, since we do that by catching some small number of fish and then seeing how many of them we re-catch later on. Suppose the probability of catching a fish really is $p = 0.125$, but we’re only trying to catch $n = 6$ fish. Here’s a couple rounds of ten simulated catchings of six fish, and how many of those were re-caught:

2, 0, 1, 0, 1, 0, 1, 0, 0, 1

2, 0, 1, 1, 0, 3, 0, 0, 1, 1

0, 1, 0, 1, 0, 0, 1, 0, 0, 0

1, 0, 0, 0, 0, 0, 0, 0, 2, 1

Obviously any one of those indicates a probability ranging from 0 to 0.5 of re-catching a fish. Technically, yes, 0.125 is a number between 0 and 0.5, but it hasn’t really shown itself. But if we average out all these probabilities … well, those forty attempts give us a mean estimated probability of 0.092. This isn’t excellent but at least it’s in range. If we keep doing the experiment we’d do better; one simulated batch of a hundred experiments turned up a mean estimated probability of 0.12833. (And there’s variations, of course; another batch of 100 attempts estimated the probability at 0.13333, and then the next at 0.10667, though if you use all three hundred of these that gets to an average of 0.12278, which isn’t too bad.)
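The re-catch experiment can be sketched the same way. This Python version assumes six marked fish, each re-caught independently with probability 0.125, and averages the re-caught fraction over a number of rounds. The function name and seeds are illustrative and it is not the Octave code that produced the figures above.

```python
import random

def estimate_catch_probability(n_marked=6, p_catch=0.125, rounds=40, seed=None):
    """Average, over `rounds` trials, the fraction of n_marked fish
    re-caught when each is re-caught independently with p_catch."""
    rng = random.Random(seed)
    fractions = []
    for _ in range(rounds):
        recaught = sum(1 for _ in range(n_marked) if rng.random() < p_catch)
        fractions.append(recaught / n_marked)
    return sum(fractions) / rounds

for rounds in (40, 100, 300):
    estimate = estimate_catch_probability(rounds=rounds, seed=rounds)
    print(rounds, round(estimate, 4))
```

With only forty rounds the estimate can land noticeably away from 0.125; it settles down as the number of rounds grows.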

This inconvenience amounts to a problem of working with small numbers in the original fish population, in the number of fish sampled in any one catching, and in the number of catches done to estimate their population. Small numbers tend to be problems for probability and statistics; the tools grow much more powerful and much more precise when they can work with enormously large collections of things. If the backyard pond held infinitely many fish we could have a much better idea of how many fish were in it.

We have a pond out back, and in 2013, added some goldfish to it. The goldfish, finding themselves in a comfortable spot with clean water, went about the business of making more goldfish. They didn’t have much time to do that before winter of 2013, but they had a very good summer in 2014, producing so many baby goldfish that we got a bit tired of discovering new babies. The pond isn’t quite deep enough that we could be sure it was safe for them to winter over, so we had to work out moving them to a tub indoors. This required, among other things, having an idea how many goldfish there were. The question then was: how many goldfish were in the pond?

It’s not hard to come up with a maximum estimate: a goldfish needs some amount of water to be healthy. Wikipedia seems to suggest a single fish needs about twenty gallons — call it 80 liters — and I’ll accept that since it sounds plausible enough and it doesn’t change the logic of the maximum estimate if the number is actually something different. The pond’s about ten feet across, and roughly circular, and not quite two feet deep. Call that a circular cylinder, with a diameter of three meters, and a depth of two-thirds of a meter, and that implies a volume of about pi times (3/2) squared times (2/3) cubic meters. That’s about 4.7 cubic meters, or 4700 liters. So there probably would be at most 60 goldfish in the pond. Could the goldfish have reached the pond’s maximum carrying capacity that quickly? Easily; you would not believe how fast goldfish will make more goldfish given fresh water and a little warm weather.
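The arithmetic for the maximum estimate is short enough to put in a few lines of Python. This sketch treats the pond as a perfect cylinder and takes the 80 liters per fish as given; both are the rough assumptions made above, and the function name is invented for the example.

```python
from math import pi

LITERS_PER_CUBIC_METER = 1000.0

def pond_capacity(diameter_m, depth_m, liters_per_fish=80.0):
    """Volume in liters of a cylindrical pond, and the number of fish
    it could hold at liters_per_fish each."""
    volume_liters = pi * (diameter_m / 2) ** 2 * depth_m * LITERS_PER_CUBIC_METER
    return volume_liters, volume_liters / liters_per_fish

volume, fish = pond_capacity(diameter_m=3.0, depth_m=2.0 / 3.0)
print(f"{volume:.0f} liters, room for about {fish:.0f} fish")
```

Because the volume goes as the square of the diameter, a modest error in the width matters more than the same error in the depth.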

It can be a little harder to quite believe in the maximum estimate. For one, smaller fish don’t need as much water as bigger ones do and the baby fish are, after all, small. Or, since we don’t really know how deep the pond is — it’s not a very regular bottom, and it’s covered with water — might there be even more water and thus capacity for even more fish? That might sound ridiculous but consider: an error of two inches in my estimate of the pond’s depth amounts to a difference of 350 liters or room for four or five fish.

We can turn to probability, though. If we have some way of catching fish (and we have; we’ve got a wire trap and a mesh trap, which we’d use for bringing in fish) we could set them out and see how many fish we can catch. If we suppose there’s a certain probability $p$ of catching any one fish, and if there are $N$ fish in the pond any of which might be caught, then we could expect that some number $M = N \cdot p$ fish are going to be caught. So if, say, we have a one-in-three chance of catching a fish, and after trying we’ve got some number $M$ fish (let’s say there were 8 caught, so we have some specific number to play with) we could conclude that there must have been about $M \div \frac{1}{3} = 8 \cdot 3$ or 24 fish in the population to catch.

This does bring up the problem of how to guess what the probability of catching any one fish is. But if we make some reasonable-sounding assumptions we can get an estimate of that: set out the traps and catch some number, call it $n$, of fish. Then set them back and after they’ve had time to recover from the experience, put the traps out again to catch $n$ fish again. We can expect that of that bunch there will be some number, call it $m$, of the fish we’d previously caught. The ratio of the fish we catch twice to the number of fish we caught in the first place should be close to the chance of catching any one fish.

So let’s lay all this out. If there are some unknown number $N$ fish in the pond, and there is a chance of $p = \frac{m}{n}$ of any one fish being caught, and we’ve caught in seriously trying $M$ fish, then: $M = N \cdot \frac{m}{n}$ and therefore $N = M \cdot \frac{n}{m}$.

For example, suppose in practice we caught ten fish, and were able to re-catch four of them. Then in trying seriously we caught twelve fish. From this we’d conclude that $12 = N \cdot \frac{4}{10}$ and therefore there are about $N = 12 \cdot \frac{10}{4} = 30$ fish in the pond.

Or if in practice we’d caught twelve fish, five of them a second time, and then in trying seriously we caught eleven fish. Then since $11 = N \cdot \frac{5}{12}$ we get an estimate of $N = 11 \cdot \frac{12}{5} = 26.4$ or call it 26 fish in the pond.

Or for another variation: suppose the first time out we caught nine fish, and the second time around, catching another nine, we re-caught three of them. If we’re feeling a little lazy we can skip going around and catching fish again, and just use the figures that $9 = N \cdot \frac{3}{9}$ and from that conclude there are about $N = 9 \cdot \frac{9}{3} = 27$ fish in the pond.

So, in principle, if we’ve made assumptions about the fish population that are right, or at least close enough to right, we can estimate what the fish population is without having to go to the work of catching every single one of them.

Since this is a generally useful scheme for estimating a population let me lay it out in an easy-to-follow formula.

To estimate the size of a population of things, assuming that they are all equally likely to be detected by some system (being caught in a trap, being photographed by someone at a spot, anything), try this:

Catch some particular number $n$ of the things. Then let them go back about their business.

Catch another $n$ of them. Count the number $m$ of them that you caught before.

The chance of catching one is therefore about $p = \frac{m}{n}$.

Catch some number $M$ of the things.

Since, we assume, every one of the things had the same chance of being caught, and since we caught $M$ of them, then we estimate there to be $N = \frac{M}{p} = M \cdot \frac{n}{m}$ of the things to catch.

Warning! There is a world of trouble hidden in that “we assume” on the last step there. Do not use this for professional wildlife-population-estimation until you have fully understood those two words.
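The recipe above fits in one small Python function. This is only a sketch of the arithmetic, with names I picked for the illustration; it does nothing about the trouble hiding in “we assume”.

```python
def estimate_population(first_catch, recaught, serious_catch):
    """Two-sample estimate: the chance of catching one thing is about
    recaught / first_catch, so the population is about
    serious_catch * first_catch / recaught."""
    if recaught <= 0:
        raise ValueError("need at least one re-caught animal to estimate the chance")
    return serious_catch * first_catch / recaught

# The three worked examples from the text.
print(estimate_population(10, 4, 12))
print(estimate_population(12, 5, 11))
print(estimate_population(9, 3, 9))
```

The check on `recaught` is there because re-catching nothing makes the estimated chance zero, and the formula then has nothing sensible to say.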

The other day my humor blog featured a little table of things for which a “hundred” of them isn’t necessarily 100 of them. It’s just a little bit of wonder I found on skimming the “Index to Units and Systems of Units” page, one of those simple reference sites that is just compelling in how much trivia there is to enjoy. The page offers examples of various units, from those that are common today (acres, meters, gallons), to those of local or historic use (the grosses tausend, the farthingdale), to those of specialized application (the seger cone, used by potters to measure the maximum temperature of a kiln). It’s just a wonder of things that can be measured.

There’s a wonderful diversity of commodities for which a “hundred” is not 100 units, though. Many — skins, nails, eggs, herring — have a “hundred” that consists of 120. That seems to defy the definition of a “hundred”, but I like to think that serves as a reminder that units are creations of humans to organize the way we think about things, and it’s convenient to have a unit that is “an awful lot of, but not unimaginably lot of” whatever we’re talking about, and a “hundred” seems to serve that role pretty well. The “hundreds” which are actually 120 probably come about from wanting to have a count of things that’s both an awful lot of the thing and is also an amount that can be subdivided into equal parts very well. 120 of a thing can be divided evenly into two, three, four, five, six, eight, ten, twelve, and so on equal shares; 100 is relatively impoverished for equal subdivisions.
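The claim about equal shares is easy to check by listing divisors. A short Python sketch, with a function name of my own choosing:

```python
def divisors(n):
    """All positive whole numbers that divide n evenly."""
    return [d for d in range(1, n + 1) if n % d == 0]

for hundred in (100, 120):
    ds = divisors(hundred)
    print(hundred, "has", len(ds), "divisors:", ds)
```

By this count 120 can be split evenly sixteen ways, counting the trivial splits into one share and into 120 shares, against nine ways for 100.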

I do not know the story behind some of the more curious hundreds, such as the counting of 106 sheep or lambs as a hundred in Roxburghshire and Selkirkshire (counties in the southeast of Scotland), or the counting of 160 dried fish as a hundred, but it likely reflects the people working with such things finding these to be slightly more convenient numbers than a plain old 100 for the “big but not unimaginably lot of” a thing. The 225 making up a hundred of onions and garlic, for example, seems particularly exotic, but it’s less so when you notice that’s 15 times 15. One of the citations of this “hundred” describes it as “15 ropes and every rope each with 15 heads”. Suddenly this hundred is a reasonable number of things that are themselves reasonable numbers of things.

Of course if they hadn’t called it a “hundred” then I wouldn’t have had a pretty easy comic bit to build from it, but how were they to know the meaning of “hundred” in everyday speech would settle down to an unimaginative solitary value?