As another example of how data is stored and manipulated in the computer, we'll look at "table data" -- a common way to organize strings, numbers, and dates in a rectangular table structure. In particular, we'll start with data from the Social Security Administration baby name site.
Social Security Baby Name Example
Names for babies born each year
Organized as a table
Fields: name, rank, gender, year
Rows: one row holds the data for one name
The table is made of 2000 rows; each row represents the data for one name
Each row is divided into 4 fields
Each of the 4 fields has its own name. The field names are: name, rank, gender, year
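As a rough sketch of this structure in plain JavaScript (not the CS101 table code itself), a table can be modeled as an array of rows, where each row is an object with the four fields. The variable names and the rank values below are invented placeholders, not data from the real file.

```javascript
// A table modeled as an array of rows (a sketch, not the CS101 implementation).
// The values are invented placeholders, not real Social Security data.
const fieldNames = ["name", "rank", "gender", "year"];

const sampleTable = [
  { name: "Alice", rank: 172, gender: "girl", year: 2010 },
  { name: "Robert", rank: 54, gender: "boy", year: 2010 },
  { name: "Abby", rank: 205, gender: "girl", year: 2010 },
];

// Every row has the same 4 fields; only the number of rows grows.
console.log("rows:", sampleTable.length, "fields per row:", fieldNames.length);
console.log(sampleTable[0].name, sampleTable[0].rank);
```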
Tables Are Very Common
Tables are a very common structure for computer data
Number of fields is small (categories)
Number of rows can be millions or billions
e.g. email inbox: one row = one message, fields: date, subject, from, to, ...
e.g. craigslist: one row = one thing for sale: description, price, seller, listing date, ...
Much of the information stored on computers uses this table structure. One "thing" we want to store -- a baby name, someone's contact info, a craigslist advertisement -- is one row. The number of fields that make up a row is fairly small -- essentially the fixed categories of information we think up for that sort of thing. For example one craigslist advertisement (stored in one row) has a few fields: a short description, a long description, a price, a seller, ... plus a few more fields.
The number of fields is small, but the number of rows can be quite large -- thousands or millions. When someone talks about a "database" on the computer, that builds on this basic idea of a table. Also storing data in a spreadsheet typically uses exactly this table structure.
Table Code
We'll start with some code -- SimpleTable -- which will serve as a foundation for you to write table code. Run the code to see what it does.
Baby data stored in "baby-2010.csv"
-- ".csv" stands for "comma separated values" and it is a simple and widely used standard format to store a table as text in a file.
The for-loop loops over the rows in a table: "for (row: table) { ... }"
-- analogous to old "for (pixel: image)" to see every pixel in an image
print(row) prints out the fields of a row on one line
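To make the .csv idea concrete, here is a minimal sketch in plain JavaScript of turning comma-separated text into rows and looping over them. The CS101 "for (row: table)" form is specific to the course; standard JavaScript's for-of loop plays the same role. The text below is an invented stand-in for baby-2010.csv: its header line and column order are assumptions, and the simple split on commas does not handle quoted fields.

```javascript
// Hypothetical stand-in for the contents of baby-2010.csv (invented values).
const csvText = [
  "name,rank,gender,year",
  "Alice,172,girl,2010",
  "Robert,54,boy,2010",
].join("\n");

// Split the text into lines, then split each line on commas.
const lines = csvText.split("\n");
const header = lines[0].split(",");
const csvRows = lines.slice(1).map((line) => line.split(","));

// Loop over every row, printing its fields on one line,
// similar in spirit to print(row) in the CS101 code.
// Note: every field comes out as a string here, e.g. "172".
const printed = [];
for (const row of csvRows) {
  const text = row.join(", ");
  printed.push(text);
  console.log(text);
}
```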
Table Query Logic
Write if-statement to select the rows we want
Database terminology - writing a "query" on the database
row.getField("field-name") returns the data for one field out of the row
Compare two values with == (two equal signs)
e.g. (row.getField("name") == "Alice")
Warning: single equal sign = does variable assignment, not comparison. Use == inside if-test.
Other comparisons: < > <= >=
Example: for-loop containing if-statement to choose the row we want
The interesting thing to do is write some "query" logic where we pick out just the rows we care about.
The row object has a row.getField("field-name") function which returns the data for one field out of the row. Each field has a name -- one of "name" "rank" "gender" "year" in this case -- and the string name of the field is passed in to getField() to indicate which field we want, e.g. row.getField("rank") to retrieve the rank out of that row.
You can test if two values are equal in JavaScript with two equal signs joined like this: ==. Using ==, the code to test if the name field is "Alice" is row.getField("name") == "Alice"
We can write an if-statement within the for-loop to test for certain rows. For each row, this code tests if the name == "Alice", pulling out and printing that one row:
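A sketch of that loop which runs in plain JavaScript is shown here. The makeRow helper is hypothetical: it only imitates the getField() behavior described above, since the real SimpleTable code is not shown, and the row values are invented. The for-of loop and console.log stand in for the CS101 "for (row: table)" and print(row).

```javascript
// makeRow is a hypothetical helper that mimics the CS101 row object;
// the real SimpleTable code is not shown here. Values are invented.
function makeRow(name, rank, gender, year) {
  const fields = { name, rank, gender, year };
  return {
    getField: (fieldName) => fields[fieldName],
    toString: () => [name, rank, gender, year].join(", "),
  };
}

const table = [
  makeRow("Alice", 172, "girl", 2010),
  makeRow("Robert", 54, "boy", 2010),
  makeRow("Abby", 205, "girl", 2010),
];

const aliceRows = [];
for (const row of table) {
  // The if-test picks out only the rows we care about.
  if (row.getField("name") == "Alice") {
    aliceRows.push(row);
    console.log(String(row)); // stands in for print(row)
  }
}
```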
Note that a single equal sign = does assignment to a variable and not comparison. It's a common mistake to type one equal sign for a test when you mean two. Unfortunately, JavaScript does not flag this error, so you have to look at your code and notice it. The regular less-than/greater-than tests < > <= >= work as we have seen before.
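The single-equals mistake can be seen in a few lines of plain JavaScript. This is only an illustration with made-up variable names: the buggy test overwrites the variable and then counts as true, so its body runs even though the name was "Bob".

```javascript
// Demonstrates why = inside an if-test is a bug (plain JavaScript).
let wrongName = "Bob";
let wrongTestRan = false;
if (wrongName = "Alice") {   // BUG: assigns "Alice" to wrongName, then treats it as true
  wrongTestRan = true;
}

let rightName = "Bob";
let rightTestRan = false;
if (rightName == "Alice") {  // comparison: false, because "Bob" is not "Alice"
  rightTestRan = true;
}

console.log(wrongTestRan, wrongName);  // the buggy test ran and changed the variable
console.log(rightTestRan, rightName);  // the correct test did not run
```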
Table Query Examples
Write in code above to solve these problems:
name field is "Alice", "Robert", "Bob", "Abby", "Abigail" (try each in turn, yes nobody names their child "Bob" .. apparently always using Robert or Bobby)
rank field is 1
rank field is < 20
rank field is >= 990
gender field is "girl"
What is going on for all these: the loop goes through all 2000 rows and evaluates the if-test for each, printing if the test is true and otherwise not printing anything.
Solution code:
If logic inside the loop:
if (row.getField("name") == "Alice") {
  print(row);
}
// Change string to "Robert", "Bob", etc.
if (row.getField("rank") == 1) {
  print(row);
}
if (row.getField("rank") < 20) {
  print(row);
}
if (row.getField("gender") == "girl") {
  print(row);
}
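The solution code does not include the rank >= 990 problem, which follows the same pattern. Here is a sketch that runs in plain JavaScript; fakeRow is a hypothetical stand-in for the CS101 row object, and the names and ranks are invented.

```javascript
// fakeRow is a hypothetical stand-in for a CS101 row (values invented).
function fakeRow(name, rank) {
  const fields = { name: name, rank: rank };
  return { getField: (fieldName) => fields[fieldName] };
}

const rankTable = [fakeRow("Alice", 172), fakeRow("Zed", 995), fakeRow("Abby", 990)];

const leastPopular = [];
for (const row of rankTable) {
  if (row.getField("rank") >= 990) {   // keeps 990 itself, unlike > 990
    leastPopular.push(row.getField("name"));
  }
}
console.log(leastPopular.join(", "));
```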
s.startsWith("hi") s.endsWith("y")
String tests: s.startsWith("A") s.endsWith("y")
-- e.g. row.getField("name").startsWith("A")
Not in older versions of JavaScript, added for CS101
Very handy for the baby name data
True/false tests of string s:
s.startsWith("Ab") -- true if s starts with "Ab"
s.endsWith("z") -- true if s ends with "z"
For our purposes, strings support an s.startsWith("Ab") function, here testing whether the string in the variable s starts with "Ab": true or false. Likewise, there is s.endsWith("yz"), here testing whether the string in variable s has "yz" at its very end. (These two functions were not part of older standard JavaScript, so I made them work just for our code because they are so useful; newer versions of JavaScript include them, and they are common in other computer languages.)
These tests work very well with the name strings pulled out of the baby data. Here we can look at all the names beginning with "Ab".
Variants to try above
name field starts with "Ab", "A", "a" (lower case), "Z", "Za" (each in turn)
name field ends with "z", "ly", "la" (each in turn)
Solution code:
If logic inside the loop:
if (row.getField("name").startsWith("Ab")) {
  print(row);
}
// Change string to "A", "a", "Z", .. each in turn
if (row.getField("name").endsWith("z")) {
  print(row);
}
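The string tests can also be tried outside the table code. Current JavaScript engines provide startsWith and endsWith natively, so this sketch runs as plain JavaScript. The list of strings is made up, and "aaron" is deliberately lower case to show that the tests treat "A" and "a" as different letters.

```javascript
// Made-up strings for trying the tests; "aaron" is lower case on purpose.
const testNames = ["Abby", "Abigail", "Alice", "aaron", "Liz"];

const startsAb = [];
const startsLowerA = [];
const endsZ = [];
for (const s of testNames) {
  if (s.startsWith("Ab")) { startsAb.push(s); }
  if (s.startsWith("a")) { startsLowerA.push(s); }   // lower case "a" only
  if (s.endsWith("z")) { endsZ.push(s); }
}

console.log("starts with Ab:", startsAb.join(", "));
console.log("starts with a:", startsLowerA.join(", "));
console.log("ends with z:", endsZ.join(", "));
```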
Boolean Logic: && || !
Boolean logic -- combine tests with and/or/not logic
&& and (two ampersands)
|| or (two vertical bars)
! not (exclamation mark)
Sorry this is a bit cryptic -- historical syntax accident
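A small sketch of the three operators at work on one string, in plain JavaScript. The variable names are made up for illustration.

```javascript
const word = "Abby";
const startsWithA = word.startsWith("A");   // true: "Abby" starts with "A"
const endsWithZ = word.endsWith("z");       // false: "Abby" ends with "y"

const both = startsWithA && endsWithZ;      // and: true only if both are true
const either = startsWithA || endsWithZ;    // or: true if at least one is true
const notA = !startsWithA;                  // not: flips true to false

console.log(both, either, notA);
```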
Boolean Logic Example 1
if (x.startsWith("A") && x.endsWith("z")) { ...
Assume x is some string we want to test
Above true only if x starts with "A" and x ends with "z"
Upper and lower case "A" and "a" are different for the tests
Standalone rule -- note that the tests joined by && || are syntactically complete tests on their own, then joined with && or ||
Incorrect: x.startsWith("A") && endsWith("z") -- the second test is missing its "x."
CS101 checking -- for our code, all the syntax parts of the if-statement are required: if (test) { ... }
CS101 tries to detect certain common errors, like omitting the {, or typing & instead of &&, giving an error message
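The standalone rule can be checked in plain JavaScript, outside the CS101 error checking. In this sketch the string "Alvarez" is just a made-up test value. The incorrect form fails because, without "x.", JavaScript looks for a separate function named endsWith, and none exists.

```javascript
const x = "Alvarez";

// Correct: each side of && is a complete test on x.
const correct = x.startsWith("A") && x.endsWith("z");

// Incorrect: the right side has no "x." so JavaScript looks for a
// standalone function named endsWith, which does not exist.
let errorName = "";
try {
  const wrong = x.startsWith("A") && endsWith("z");
  console.log(wrong);
} catch (err) {
  errorName = err.name;
}

console.log(correct, errorName);
```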
Boolean Logic Example 2
if (x.startsWith("A") || x.endsWith("z")) { ...
Above is true if x starts with "A" or x ends with "z" (or both)
e.g. x = "Abby", test is true
e.g. x = "fez", test is true
e.g. x = "abc", test is false
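The three example strings above can be run through the || test in plain JavaScript as a quick check:

```javascript
// The three example strings come from the notes above.
const orExamples = ["Abby", "fez", "abc"];
const orResults = [];
for (const x of orExamples) {
  // True if either test is true (or both).
  const result = x.startsWith("A") || x.endsWith("z");
  orResults.push(result);
  console.log(x, result);
}
```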
Boolean Logic Example 3
Here's the if-test for the query: popular names beginning with A (i.e. name starts with A and rank <= 50). Instead of x, now use the actual row.getField("name") to get the string out of the row.
In this case the test has been broken across multiple lines. You can break the test across multiple lines to avoid having so much code all on one line. Below, the second and later lines are indented to line up with the first line for neatness (not required). Note that the if-statement still has the structure: if (test) { ..code.. } but with the test spanning multiple lines. CS101 will warn about common syntax errors for if-statements: omitting the (test) parentheses, omitting curly braces, typing = instead of ==, typing & instead of &&, typing | instead of ||. In professional JavaScript, things like a single | are allowed in rare cases, but for CS101 we'll figure that was just a typo.
if (row.getField("name").startsWith("A") &&
    row.getField("rank") <= 50) { ...
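Here is a sketch of that query running over a few rows in plain JavaScript. The nameRow helper is a hypothetical stand-in for the CS101 row object, and the names and ranks are invented for illustration.

```javascript
// nameRow is a hypothetical stand-in for a CS101 row; values are invented.
function nameRow(name, rank) {
  const fields = { name: name, rank: rank };
  return { getField: (fieldName) => fields[fieldName] };
}

const babyRows = [
  nameRow("Ava", 5),
  nameRow("Alice", 172),
  nameRow("Emma", 3),
  nameRow("Andrew", 14),
];

const popularA = [];
for (const row of babyRows) {
  // Both tests must be true for the row to be selected.
  if (row.getField("name").startsWith("A") &&
      row.getField("rank") <= 50) {
    popularA.push(row.getField("name"));
  }
}
console.log(popularA.join(", "));
```

Alice is left out because its rank fails the second test, and Emma is left out because its name fails the first.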
Now add the constraint that they must be girl names: