Closed Bug 1435313 Opened 6 years ago Closed 6 years ago

unify and clean up local dev env, -stage, and -prod databases

Categories

(Socorro :: Database, task, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: willkg, Assigned: willkg)

References

Details

Attachments

(2 files)

While working on a db audit to figure out where everything is at, I noticed we have tables and stored procedures in the -stage db that don't exist elsewhere, and tables and stored procedures in the -prod db that don't exist elsewhere. This is ... unenthusing.

This bug covers building a list of things that should not be in the -stage db anymore and a list of things that should not be in the -prod db anymore and getting rid of them possibly by throwing them into one big "cleanup" migration.
We're building a new infrastructure which will probably do migrations during deploys automatically. For that to work, it's pretty critical for future migrations to not have relationships with orphaned tables, stored procedures, and other things floating around.

Making this a P1 to do soon.
Blocks: 1361394
Priority: -- → P1
I'm going to use the following to figure things out:

\d+  -- tables, views, and sequences
\df  -- functions
\dT+ -- types

I checked for material views and triggers and we don't have any in any of the environments.

I'm going to assume the local dev environment is correct and anything not in it is either a partition table (which is fine), orphaned (not fine), or accumulated cruft (not fine).

Stage has the following not in local dev environment:

* activity_snapshot
* amd_crash_signatures
* bad_raw_crashes
* crashes_in_es
* djcelery_crontabschedule
* djcelery_crontabschedule_id_seq
* djcelery_intervalschedule
* djcelery_intervalschedule_id_seq
* djcelery_periodictask
* djcelery_periodictask_id_seq
* djcelery_periodictasks
* djcelery_taskstate
* djcelery_taskstate_id_seq
* djcelery_workerstate
* djcelery_workerstate_id_seq
* high_load_temp
* index_bloat
* locks
* locks1
* locks2
* locks3
* modules_module_id_seq
* peterbes_temp_hack_table
* plugins_reports (and partitions)
* product_info_changelog
* product_version_id_seq
* selena_new_reports_duplicates
* seq_reports_id
* south_migrationhistory
* south_migrationhistory_id_seq
* uptime_levels_uptime_level_seq
* (function) backfill_crashes_by_user_oldies
* (function) backfill_nightly_builds
* (function) backfill_one_day
* (function) backfill_signature_summary
* (function) backfill_signature_summary_devices
* (function) backfill_signature_summary_graphics
* (function) check_partitions
* (function) create_table_if_not_exists
* (function) pacific2ts
* (function) pc_weekly_report_partitions
* (function) update_correlations
* (function) update_crashes_by_user_oldies
* (function) update_nightly_builds
* (function) update_reports_clean_cron
* (function) update_signature_summary
* (function) utc_day_near
* (function) watch_report_processing

Stage is missing:

* product_versions_product_version_id_seq

Seems like we've orphaned a bunch of tables over the years. Also, there are a couple of tables that were probably used when fixing something, but never got cleaned up.

A lot of the functions still in -stage were removed in alembic migration 37f7e089210c_bug_1424027_remove_stored_procedures.py. Maybe the signatures weren't the same? Maybe there were typos in the migration? Regardless, I think I did it wrong.
I grabbed a schema of the -stage db, loaded it locally, and went about building a script that we can run with psql that cleans up the -stage db. This is the script:

"""
-- -stage db cleanup script

-- Clean up tables

DROP TABLE "activity_snapshot";
DROP TABLE "amd_crash_signatures";
DROP TABLE "bad_raw_crashes";
DROP TABLE "crashes_in_es";
DROP TABLE "djcelery_periodictask";
DROP TABLE "djcelery_crontabschedule";
DROP TABLE "djcelery_intervalschedule";
DROP TABLE "djcelery_periodictasks";
DROP TABLE "djcelery_taskstate";
DROP TABLE "djcelery_workerstate";
DROP TABLE "high_load_temp";
DROP VIEW "index_bloat";
DROP TABLE "locks";
DROP TABLE "locks1";
DROP TABLE "locks2";
DROP TABLE "locks3";
-- used by "modules" DROP SEQUENCE "modules_module_id_seq";
DROP TABLE "peterbes_temp_hack_table";
DROP TABLE "product_info_changelog";
-- used by "product_versions" DROP SEQUENCE "product_version_id_seq";
DROP TABLE "selena_new_reports_duplicates";
DROP SEQUENCE "seq_reports_id";
DROP TABLE "south_migrationhistory";
-- used by uptime_levels DROP SEQUENCE "uptime_levels_uptime_level_seq";


-- plugins_reports_YYYYMMDD tables as of february 2nd, 2018
DROP TABLE "plugins_reports_20171225";
DROP TABLE "plugins_reports_20180101";
DROP TABLE "plugins_reports_20180108";
DROP TABLE "plugins_reports_20180115";
DROP TABLE "plugins_reports_20180122";
DROP TABLE "plugins_reports_20180129";
DROP TABLE "plugins_reports_20180205";
DROP TABLE "plugins_reports_20180212";
DROP TABLE "plugins_reports";
DELETE FROM "report_partition_info" WHERE table_name = 'plugins_reports';

-- Clean up functions

DROP FUNCTION backfill_crashes_by_user_oldies(date, interval);
DROP FUNCTION backfill_nightly_builds(date);
DROP FUNCTION backfill_one_day();
DROP FUNCTION backfill_signature_summary(date);
DROP FUNCTION backfill_signature_summary_devices(date);
DROP FUNCTION backfill_signature_summary_graphics(date);
DROP FUNCTION check_partitions(text[], integer, OUT integer, OUT text);
DROP FUNCTION create_table_if_not_exists(text, text, text, text[]);
DROP FUNCTION pacific2ts(timestamp with time zone);
DROP FUNCTION pc_weekly_report_partitions(integer, timestamp with time zone);
DROP FUNCTION update_correlations(date, boolean, interval);
DROP FUNCTION update_crashes_by_user_oldies(date, boolean, interval);
DROP FUNCTION update_nightly_builds(date, boolean, interval);
DROP FUNCTION update_reports_clean_cron(timestamp with time zone);
DROP FUNCTION update_signature_summary(date, boolean);
DROP FUNCTION utc_day_near(timestamp with time zone, timestamp without time zone);
DROP FUNCTION watch_report_processing(INOUT integer, OUT integer, OUT interval, OUT interval, OUT interval);
"""

That works locally in regards to correct function signatures and also the order in which things are dropped.

I want to run this manually on the -stage db. If I messed up and we need to recreate anything, we have a very recent dump of the -stage schema we can pull from.

Miles, Mike: Do you see anything wrong with my methodology? Does this look ok to run manually on the -stage db?
Flags: needinfo?(mkelly)
Flags: needinfo?(miles)
Assignee: nobody → willkg
Status: NEW → ASSIGNED
You have a couple lines that start with comments, such as "-- used by uptime_levels DROP SEQUENCE "uptime_levels_uptime_level_seq";" - are those working as intended?

I can't say I'm familiar with these tables, functions, and sequences, but I think this looks good aside from that.

We can take a database snapshot before you run this. Let's coordinate that.
Flags: needinfo?(miles)
The lines that start with -- are commented out.

For example, "uptime_levels" is a table that's not being deleted and it uses "uptime_levels_uptime_level_seq" which is in the list of things that are on -stage that aren't a fresh db. I figured I'd leave these comments in so it's clearer I didn't inadvertently miss things from comment #2.
Looks good to me.
Flags: needinfo?(mkelly)
Since I created the script last week, I had to update it before I ran it so that it dropped "plugins_reports_20180219" as well. I ran that on -stage just now and it went super. I'll attach the complete script.
I'll let -stage sit for a day and figure out -prod tomorrow.
Using the same method I used in -stage:

1. create a "build a schema" script from the -prod db so I can test the .sql file locally
2. go what's in the db of a fresh local dev environment and what's in the -prod db and make a list of differences in tables, functions, and types
3. write a .sql script to drop the relevant differences from -prod
4. test the script in my local dev environment in a db built using the "build a schema" script from -prod

I came up with this:

"""
-- -prod db cleanup script for bug #1435313

-- Clean up tables

DROP TABLE "activity_snapshot";
DROP TABLE "amd_crash_signatures";
DROP TABLE "bad_raw_crashes";
DROP TABLE "crashes_in_es";
DROP TABLE "djcelery_periodictask";
DROP TABLE "djcelery_crontabschedule";
DROP TABLE "djcelery_intervalschedule";
DROP TABLE "djcelery_periodictasks";
DROP TABLE "djcelery_taskstate";
DROP TABLE "djcelery_workerstate";
DROP VIEW "index_bloat";
DROP TABLE "locks";
DROP TABLE "locks1";
DROP TABLE "locks2";
DROP TABLE "locks3";
DROP TABLE "product_info_changelog";
DROP TABLE "selena_new_reports_duplicates";
DROP SEQUENCE "seq_reports_id";
DROP TABLE "south_migrationhistory";

-- plugins_reports_YYYYMMDD tables as of February 7th, 2018
DROP TABLE "plugins_reports_20171225";
DROP TABLE "plugins_reports_20180101";
DROP TABLE "plugins_reports_20180108";
DROP TABLE "plugins_reports_20180115";
DROP TABLE "plugins_reports_20180122";
DROP TABLE "plugins_reports_20180129";
DROP TABLE "plugins_reports_20180205";
DROP TABLE "plugins_reports_20180212";
DROP TABLE "plugins_reports";
DELETE FROM "report_partition_info" WHERE table_name = 'plugins_reports';

-- Clean up functions

DROP FUNCTION backfill_crashes_by_user_oldies(date, interval);
DROP FUNCTION backfill_nightly_builds(date);
DROP FUNCTION backfill_one_day();
DROP FUNCTION backfill_signature_summary(date);
DROP FUNCTION backfill_signature_summary_devices(date);
DROP FUNCTION backfill_signature_summary_graphics(date);
DROP FUNCTION check_partitions(text[], integer, OUT integer, OUT text);
DROP FUNCTION create_table_if_not_exists(text, text, text, text[]);
DROP FUNCTION is_rapid_beta(citext, text, text);
DROP FUNCTION pacific2ts(timestamp with time zone);
DROP FUNCTION pc_weekly_report_partitions(integer, timestamp with time zone);
DROP FUNCTION socorro_db_data_refresh(timestamp with time zone);
DROP FUNCTION update_correlations(date, boolean, interval);
DROP FUNCTION update_crashes_by_user_oldies(date, boolean, interval);
DROP FUNCTION update_nightly_builds(date, boolean, interval);
DROP FUNCTION update_reports_clean_cron(timestamp with time zone);
DROP FUNCTION utc_day_near(timestamp with time zone, timestamp without time zone);
DROP FUNCTION watch_report_processing(INOUT integer, OUT integer, OUT interval, OUT interval, OUT interval);
"""

It's much like the one in -stage, but with some minor differences.

Lonnen: Does the methodology for how we built this seem sound? Does the script look ok?
Flags: needinfo?(chris.lonnen)
I had a dream last night. Lonnen and I were sitting in a diner. It was dark outside, but I'm not sure what time it was. There was no one else in the diner. The diner is a mish-mosh of pink and blue with silver edging. One of the lights is flickering. I think it's the same rhythm as Another One Bites the Dust, but I'm not sure.

Lonnen looks me in the eye and says, "You should get the pancakes."

I'm confused because I'm not sure why he's telling me this. Then he leans in a little closer and says more intensely, "You should get the pancakes."

I'm puzzled, but I figure that I like pancakes, so why not. I look around for wait staff or to figure out if I'm supposed to go to the counter and order.

Then Lonnen puts his hands on the table and says, "The pancakes, Kahn-Greene. Get. The. Pancakes."

I go up to the counter and order the pancakes. I don't actually talk to anyone, but I figure if I say it out loud then maybe that satisfies Lonnen somehow. Maybe the pancakes will come if I hope hard enough. I go back to my seat in our booth.

Lonnen looks at me and says. "Good job." Then he *looks* at me and he continues to sit with his hands on the table, but he's moving away from me as the table suddenly grows in length in an exponential fashion. Then he disappears off into the distance. 

I'm now alone in the diner. I sit for a while. The pancakes don't come.

Anyhow, long story short, Lonnen said via IRC that he's fine with the script in comment #10.
Flags: needinfo?(chris.lonnen)
I ran the script in -prod just now. It went super. I attached the script.

Everything is done here, so I'm marking as FIXED.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
+1. anyhow, here's wonderwall.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: