#100DaysOfCode - Day 7: SQL Relations, Joins
Today, I got to work a good bit with SQL tables, reaqaint myself with (nested)
JOINS, and the
In SQL, you can have a table’s field (or column with a title, in terms of excel tables) referenced by other tables, and that creates a link to that table from that other table based on that ‘key’. The other table’s reference to the main table’s column is called a
foreign key (officially), and SQL has built in procedures to keep you from deleting those elements to keep data from getting fragmented, excepting the case where you remove the constraints. I’ll note, the best practice to delete data that’s spread across tables like this is NOT to remove the constraint, but rather to mark the data as ‘dead’ via a column that has a TRUE/FALSE value to declare it as such, but I’ll try not to tangent from my experiences today!
In this case, I was working with test data, and had two tables that had this similar relationship between tables via a joined column, but it was all done OUTSIDE of SQL, meaning there was no connection that the database was aware of. There was a field in a table A, which was SOFT referenced in other tables, so it did not have that restriction, and it made deleting the field from table A possible, which was great, until we realized those other tables were causing errors.
In trying to chase down what those stray values were, I got to use the following, VERY standard SQL command, to find those stray values
DELETE * FROM b WHERE b.aID is NOT IN (SELECT * FROM a.ID) /* NOTE: b.aID is the name that WE specified b to have for the column that matches a.ID. This could be named anything, but we chose aID to make it evident that it's a foreign key pointing to A's primary key */
… and ultimately delete them =)
Now, occasionally, you may need to get a bit more complex, and in my case it was admittedly a bit more complex than the previous scenario, because there were more tables stemming off of the table
b. In cases like the one I ran into, it’s important to be very familiar with
SQL JOINS, that select (and thus delete) columsn based on those relations I talked about previously.
JOINS, you need to specify the data you want to
SELECT, tell it what table to
JOIN with, and specify what column you want to join
ON. While I like to use parentheses syntax to specify operations, it’s not really necessary, as joins run in order like arithmetic.
Here’s an example of the two different styles, which from my (rusty) understanding, should be functionally equivalent.
/* WITH PARENTHESES */ SELECT * FROM (SELECT * FROM a LEFT JOIN b ON a.Id = b.aID) ab LEFT JOIN c ON ab.cID = c.Id /* WITHOUT PARENTHESES */ SELECT * FROM a LEFT JOIN b ON a.Id = b.aID LEFT JOIN c ON a.cID = c.Id
Here’s a summary of the primary types of joins as well!
INNER JOIN: Joins two tables on a specified column and returns only items that BOTH have
LEFT JOIN: Joins two tables on a specified column and returns all rows from the first table, with matching rows from the 2nd table where they match the first table.
OUTER JOIN: Joins two tables on a specified column and returns all the rows from BOTH tables (even non-matching rows) leaning up tables with foreign keys that had source deleted.
Tomorrow, if I’m able to take the time, I’ll try to dive into some of the details of these
JOINS, and how they tie in with the
DELETE (rather than the