Database: SELECT: Difficult Joins

For simple joins, an inner join on the relevant primary/foreign key fields is usually what is needed, and the appropriate comparisons can usually be easily deduced without too much trouble. Many joins only involve two tables, so this shouldn't be too difficult. For more difficult joins, you need to adopt a somewhat more systematic means of creating the proper join. Here is the approach that I use.
  1. Use a relational schema diagram. Even if you're not using Microsoft Access, many SQL programmers create Access relational diagrams to design their tables and show relationships between primary/foreign keys. High-end databases have their tools for this too. Or you can always use paper and pencil.
  2. Choose the SELECT fields (columns). The fields which are selected provides the rough draft of the tables which are needed in the FROM clause, at a minimum.
  3. Which tables are needed can be seen by following the links in the relational diagram to all the tables that are needed. The FROM clause will include all tables that have fields you need in the SELECT column list, plus all tables that are used to link those tables.
  4. Qualify the field (column) names. If field names are not prefixed with some phrase which indicates the table they are in, then it's a good idea to qualify references to them in the SELECT statement for clarity.
  5. Self joins are necessary if when showing or comparing fields from more than one row in a table. For example, if you are showing the names of two sales reps in the same output row, use a self join (and the necessary aliases). For example,
        SELECT Lazy.xxx Busy.xxx
           FROM Salesreps Lazy, Salesreps Busy, ...
  6. Possible additional linkages. There may be other tables that are needed to create the linkages.
  7. Look at keys. Almost all joins use primary and foreign key fields. It's possible to join on other fields, but unusual.
  8. Choosing alternative comparisons. Sometimes there is a choice of which fields to compare to get the same result. If you can, choose fields which are indexed because these are usually faster. Primary keys are the only fields that are usually indexed by default.
  9. Write comments. After each WHERE term write a comment enclosed in /* ... */ which describes what the term does.
  10. Other conditions. Remember that not all the terms in the WHERE clause are part of the join -- there may be some conditions that restrict the number of rows, but are unrelated to joining tables. Write these last to keep them logically separated from the joins.