Algorithms OR Data Structures

Submitted by nvovn on Wed, 03/22/2017 - 15:36

At the verge of a non existing extra day caused by a leap year (2017 is not a leap year) I would like to remind you of Kernighan & Ritchie's famous book on the language C. Therein you learn how to correctly implement the leap year algorithm, as found in Wikipedia:

if (year is not divisible by 4) then 
  (it is a common year) 
else if (year is not divisible by 100) then 
  (it is a leap year) 
else if (year is not divisible by 400) then 
  (it is a common year) 
  (it is a leap year)

Although you can directly implement it in 10 lines of code (which could trigger another blog post on the failure of the Lines Of Code KPI paradigm...), as a "real programmer" you would implement it like so:

isLeapYear = (year % 4 == 0) && (year % 100 != 0)) || (year % 400 == 0);

This is oh so wrong. Why? Because it uses a (simple) algorithm instead of thinking about the best data structure. And bear in mind even your smart phone has more then some GB of memory.

Consider an associative array holding the relevant data (from 2000 to 2100), here in PHP code:

a['2000'] = {'leap': true};

How would you check, if the year 2000 is a leap year? Via

if (a['2000']=>'leap') then 
  print 'is leap year\n';

What you now see is that you exchanged computing power (three modulo evaluations plus two logical expressions) by storage space (100 entries) and one array evaluation. Funny enough this technique perfectly fits to the time dimension idea in DWH concepts. 

Neat, isn't it?

BTW: The blog title is referring to Nicolaus Wirth's book called "Algorithms and Data Structures". This book is brilliant, but the title is clearly misleading...