Search This Blog

Wednesday, August 27, 2014

Scottish Referendum

A quick look at the density / spatial pattern from geotweets (i.e. those from smartphones mainly)… filtered on keywords as detailed in the image.

Mainly North of the border then….



SQLITE to CSV on Mac

Open Terminal

cd to directory with sqlite database in it

 sqlite3 dbname

sqlite> .headers on
sqlite> .mode csv
sqlite> .output test.csv
sqlite> select * from tbl1;
sqlite> .output stdout


Monday, August 25, 2014

Big-O notation notes

Big-O notation is a relative representation of the complexity of an algorithm. These notes taken from these 3 sources:

  • 1 item: 1 second
  • 10 items: 100 seconds
  • 100 items: 10000 seconds
Notice that the number of items increases by a factor of 10, but the time increases by a factor of 102. Basically, n=10 and so O(n2) gives us the scaling factor n2 which is 102.
  • 1 item: 1 second
  • 10 items: 10 seconds
  • 100 items: 100 seconds
This time the number of items increases by a factor of 10, and so does the time. n=10 and so O(n)'s scaling factor is 10.
O(log n):
  • 1 item: 1 second
  • 10 items: 2 second
  • 100 items: 3 seconds

  • 1 item: 1 second
  • 10 items: 1 second
  • 100 items: 1 second


If we add two 100 digit numbers together we have to do 100 additions. If we add two 10,000 digit numbers we have to do 10,000 additions.
See the pattern? The complexity (being the number of operations) is directly proportional to the number of digits n in the larger number. We call this O(n) or linear complexity.
Now if you were instructing a computer to look up the phone number for "John Smith" in a telephone book that contains 1,000,000 names, what would you do? Ignoring the fact that you could guess how far in the S's started (let's assume you can't), what would you do?
A typical implementation might be to open up to the middle, take the 500,000th and compare it to "Smith". If it happens to be "Smith, John", we just got real lucky. Far more likely is that "John Smith" will be before or after that name. If it's after we then divide the last half of the phone book in half and repeat. If it's before then we divide the first half of the phone book in half and repeat. And so on.
This is called a binary search and is used every day in programming whether you realize it or not.
So if you want to find a name in a phone book of a million names you can actually find any name by doing this at most 20 times. In comparing search algorithms we decide that this comparison is our 'n'.
  • For a phone book of 3 names it takes 2 comparisons (at most).
  • For 7 it takes at most 3.
  • For 15 it takes 4.
  • For 1,000,000 it takes 20.
That is staggeringly good isn't it?
In Big-O terms this is O(log n) or logarithmic complexity. Now the logarithm in question could be ln (base e), log10, log2 or some other base. It doesn't matter it's still O(log n) just like O(2n2) and O(100n2) are still both O(n2).
It's worthwhile at this point to explain that Big O can be used to determine three cases with an algorithm:
  • Best Case: In the telephone book search, the best case is that we find the name in one comparison. This is O(1) or constant complexity;
  • Expected Case: As discussed above this is O(log n); and
  • Worst Case: This is also O(log n).


Polynomial Time

Another point I wanted to make quick mention of is that any algorithm that has a complexity of O(na) is said to have polynomial complexity or is solvable in polynomial time.
O(n), O(n2) etc are all polynomial time. Some problems cannot be solved in polynomial time. Certain things are used in the world because of this. Public Key Cryptography is a prime example. It is computationally hard to find two prime factors of a very large number. If it wasn't, we couldn't use the public key systems we use.

Monday, August 4, 2014

Flyover of Chambers Street in Edinburgh

Flyover of Chambers Street on a Digital Surface Model built from a LIDAR dataset.