...
This is a table called Students:
ID | Surname | Forename | Country | Course | FTE |
---|---|---|---|---|---|
1 | Smith | John | 02 | Biology | 1.0 |
2 | Cooper | Brian | 05 | Politics | 1.0 |
3 | Wertz | Heidi | 14 | Biology | 0.5 |
4 | Goodman | Pete |
| Archaeology | 1.0 |
Query
To get this data, we run the following SQL query:
Code Block |
---|
SELECT ID, Surname, Forename, Country, Course, FTE
FROM Students
|
...
The SELECT keyword is what you use to filter your data by column, so if you're just after department course and ID:
Code Block |
---|
SELECT ID, Surname,Course Forename FROM Students |
gives:
ID | Course |
---|---|
1 | Biology |
2 | Politics |
3 | Biology |
4 | Archaeology |
WHERE
To filter by row, we need a new keyword: WHERE.
Code Block |
---|
SELECT ID, Course FROM Students WHERE Course = "Biology" |
gives:
ID | Course |
---|---|
1 | Biology |
3 | Biology |
If you want to be more restrictive, you can provide multiple WHERE conditions:
...
If you want to sort your results, use the ORDER BY clause.
Code Block |
---|
SELECT Surname, Forename
FROM Students
WHERE Course = "Biology"
ORDER BY Surname, Forename
|
...
The default sort order is sensible. If it doesn't suit, you can use ASC or DESC:
Code Block |
---|
SELECT Surname, Forename
FROM Students
WHERE Course = "Biology"
ORDER BY Surname DESC, Forename ASC
|
...
To aggregate data, use GROUP BY:
Code Block |
---|
SELECT Course
FROM Students
GROUP BY Course
|
...
This isn't usually what you want - aggregation is about counting/summing things. We need to create a new column:
COUNT
Code Block |
---|
SELECT Course, COUNT(ID)
FROM Students
GROUP BY Department
|
...
If instead you want to total up a number column, use SUM. To relabel columns, use AS.
Code Block |
---|
SELECT Course, SUM(FTE) AS TotalFTE, SUM(1) AS TotalStudents
FROM Students
GROUP BY Department
|
Course | ||
---|---|---|
Biology | 1.5 | 2 |
Archaeology | 1 | 1 |
Politics | 1 | 1 |
...
We have a code in our Students table so we need to relate it to its lookup, Country:
02 | England |
05 | France |
06 | Wales |
14 | Germany |
We relate tables using JOIN, which comes in a couple of varieties:
INNER JOIN
Code Block |
---|
SELECT Students.Surname, Students.Forename, Country.CountryName
FROM Students
INNER JOIN Country ON Students.Country = Country.CountryCode
|
gives:
Surname | Forename | |
---|---|---|
Smith | John | England |
Cooper | Brian | France |
Wertz | Heidi | Germany |
The INNER JOIN part says we want to return rows where a match exists in both tables, using the match criteria of Students.Country = Country.CountryCode. The first part is the table name, the second is the column name. We don't get a result for Pete Goodman because his country is blank, and there isn't a valid lookup row in the Country table.
...
Left join gets all the values from the first table, and any rows that match from the second:
Code Block |
---|
SELECT Students.Surname, Students.Forename, Country.CountryName
FROM Students
LEFT JOIN Country ON Students.Country = Country.CountryCode
|
gives:
Surname | Forename | |
---|---|---|
Smith | John | England |
Cooper | Brian | France |
Wertz | Heidi | Germany |
Goodman | Pete |
|
Now it's clearer that you have all your students showing, and one has a missing value you need to look into.
...
You can use queries in place of a table in the FROM clause which lets you do more complex querying in one statement - say you only want to get countries which have more than 100 students - first, do the sum:
Code Block |
---|
SELECT Country, SUM(1) AS TotalStudents
FROM Students
GROUP BY Country
|
gives:
Country | |
---|---|
02 | 1 |
03 | 2 |
19 | 450 |
16 | 103 |
Then wrap this in brackets and use it in your FROM clause:
Code Block |
---|
SELECT Country.CountryName, StuCount.TotalStudents
FROM
(
SELECT Country, SUM(1) AS TotalStudents
FROM Students
GROUP BY Country
) StuCount
LEFT JOIN Country ON Students.Country = Country.CountryCode
WHERE StuCount.TotalStudents >= 100
|
gives:
Bulgaria | 450 |
Micronesia | 103 |
Nesting is a powerful way to take oodles of big data, perform a sequence of steps on it, in order to squish it down into exactly the simple form you want. It's also a superfast way to get lost in onion layers of logic and make small mistakes that are obscured until a few years later when something new crops up and breaks things.
...
This is about as far as you can get with the standard SQL syntax. Despite agreements all round, database vendors have decided they know best and have their own implementation of things that are so common across the world that you sometimes can't believe it. Say you want the last character of a value (we often do, as we keep programme length in this place of many codes).
Microsoft SQL Server / MySQL
Code Block |
---|
SELECT RIGHT(ProgrammeCode,1) |
...