AWS re/Start Lab · Bases de Datos

Working with Functions

This lab extends the earlier query exercises by introducing aggregate functions, string splitting, string length checks, trimming, and duplicate filtering. The queries still use the world.country table, but the focus is now on transforming and filtering data with functions.

Lab Summary

Connected to the Command Host, reviewed the world database, used aggregate functions to summarize population values, filtered rows with SUBSTRING_INDEX(), cleaned region names with LENGTH(TRIM(...)), removed duplicates with DISTINCT, and solved the challenge by splitting Micronesia/Caribbean into two separate columns.

Aggregate Functions

Calculated global totals and summary values with SUM(), AVG(), MAX(), MIN(), and COUNT().

String Processing

Used SUBSTRING_INDEX(), LENGTH(), and TRIM() to filter records based on parts of a string.

Challenge Output

Split one region value into two separate columns so the result set became easier to read.

Step-by-Step Walkthrough

The lab moved from broad numeric summaries to more targeted string-based filters and then finished with a formatting challenge.

01

Connect and review the world database

  • Opened the Command Host with Session Manager, switched to root, and connected to MariaDB with mysql -u root --password='re:St@rt!9'.
  • Verified that the world database existed with SHOW DATABASES; and reviewed the country table before applying any functions.
  • Ran the aggregate query with SUM(), AVG(), MAX(), MIN(), and COUNT() to summarize population data across all countries.
This first part is useful because it shows the difference between returning individual rows and returning one summarized result generated from many rows.
02

Filter string fragments with SUBSTRING_INDEX

  • Used substring_index(Region, " ", 1) to split the region value at the first blank space.
  • Applied the same function inside the WHERE clause to return only countries where the first word of the region was Southern.
  • Verified that the results included countries from Southern Europe, Southern Africa, and Southern and Central Asia.
03

Use LENGTH, TRIM, and DISTINCT together

  • Ran LENGTH(TRIM(Region)) < 10 to return only region names shorter than ten characters after trimming any leading or trailing spaces.
  • Observed repeated rows in the first result set because several countries share the same short region name.
  • Added DISTINCT(Region) to remove duplicates and keep only the unique region names that match the condition.
04

Solve the challenge by splitting the region value

  • Built a query for the row where Region = "Micronesia/Caribbean".
  • Used substring_index(Region, "/", 1) for Region Name 1 and substring_index(Region, "/", -1) for Region Name 2.
  • Confirmed that the output split the original region string into Micronesia and Caribbean as separate columns.

Query Reference

Main functions used to summarize values, manipulate strings, and clean repeated results.

sql

SUM(Population), AVG(Population), MAX(Population), MIN(Population), COUNT(Population)

Summarizes numeric values across all matching rows.

sql

substring_index(Region, " ", 1)

Returns the text before the first blank space in the region name.

sql

LENGTH(TRIM(Region)) < 10

Checks the length of a string after removing leading and trailing blank spaces.

sql

DISTINCT(Region)

Filters duplicate region names so each one appears only once in the result set.

sql

substring_index(Region, "/", 1)

Returns the part of the region string before the slash.

sql

substring_index(Region, "/", -1)

Returns the part of the region string after the slash.

Key Learnings

What Was Actually Learned

Aggregate functions summarize many rows into one result, which is useful for quick totals and extremes.
String functions can be used not only in the SELECT list, but also inside the WHERE clause to filter records.
DISTINCT is useful when a filter produces repeated values and the goal is to keep a clean unique list.

Technical Conclusion

This lab made SQL feel more practical because the queries were no longer limited to returning stored values exactly as they exist in the table. Instead, the functions transformed the data into forms that were more useful for analysis and presentation.

The challenge was a good final example of that idea. A single region value was turned into two separate output columns, which shows how query functions can reshape the result set without changing the underlying data.