Database Trimming
In this article:
This article provides best practices for trimming the CloudShell databases. It is important to periodically trim the databases to minimize load and optimize performance. CloudShell uses several SQL and MongoDB databases for data storage and retrieval.
The databases that may cause issues from time to time are:
- Quali DB: SQL database that stores sandbox deployment and lifecycle management data
- Insight DB: SQL database that stores Insight data
- MongoDB: MongoDB database that stores sandbox Activity Feed and Output events
- Keep a backup copy of the production database before defragmenting/trimming.
- As a best practice, we recommend doing the following:
- To allow CloudShell to work uninterrupted while you trim/defragment the database, run the procedure on a separate SQL Server. To do so, back up the database, restore it on another SQL Server and run the script on the restored database. When the operation completes, restore the updated database in the production SQL Server.
- If you continue working with CloudShell while the database is being trimmed/defragmented, you may lose the new data when restoring back the database. To avoid losing new data, set a Maintenance Window for the duration of the operation. To get an idea of the time it will take, please perform the defragmentation/trimming procedure on a copy of the database. The operation may take several hours, depending on the size of the database.
Defragment Quali / Insight DBs
We recommend defragmenting the Quali DB on a weekly/monthly schedule or when the fragmentation reaches a certain fragmentation threshold, such as 30%.
To check the fragmentation level of a database:
- In the script below, set the following parameters
- [Quali]: Replace with the name of the database (leave the square brackets).
- DDIPS.avg_fragmentation_in_percent: Set the fragmentation threshold percentage (default is 0)
USE [Quali] SELECT S.name as 'Schema', T.name as 'Table', I.name as 'Index', DDIPS.avg_fragmentation_in_percent, DDIPS.page_count FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, NULL) AS DDIPS INNER JOIN sys.tables T on T.object_id = DDIPS.object_id INNER JOIN sys.schemas S on T.schema_id = S.schema_id INNER JOIN sys.indexes I ON I.object_id = DDIPS.object_id AND DDIPS.index_id = I.index_id WHERE DDIPS.database_id = DB_ID() and I.name is not null AND DDIPS.avg_fragmentation_in_percent > 0 ORDER BY DDIPS.avg_fragmentation_in_percent desc
To rebuild a database:
-
In the script below, replace [Quali] with the name of the database (leave the square brackets) and run the script.
USE [Quali] GO DECLARE @TableName VARCHAR(255) DECLARE @sql NVARCHAR(500) DECLARE @fillfactor INT SET @fillfactor = 80 DECLARE TableCursor CURSOR FOR SELECT OBJECT_SCHEMA_NAME([object_id])+'.'+name AS TableName FROM sys.tables OPEN TableCursor FETCH NEXT FROM TableCursor INTO @TableName WHILE @@FETCH_STATUS = 0 BEGIN SET @sql = 'ALTER INDEX ALL ON ' + @TableName + ' REBUILD WITH (FILLFACTOR = ' + CONVERT(VARCHAR(3),@fillfactor) + ')' EXEC (@sql) FETCH NEXT FROM TableCursor INTO @TableName END CLOSE TableCursor DEALLOCATE TableCursor GO
Trim Insight DB of old data
This database doesn't have a retention policy, which may cause unnecessary load in highly active CloudShell deployments over a long enough period. As such, we recommend trimming this database for unneeded data every 1-2 years.
For example, this particular script clears data that is older than 12 months from the specified Insight DB.
Note: To avoid loss of data, make sure the trimming process is completed before the Insight DB's nightly sync with CloudShell. For details, see Synchronization.
To trim the database using the above script:
-
Open the script in an text editor and set the following parameters:
- DB_NAME: Enter the name of the relevant database. There are 3 places within the script.
- @months_keep: Set the data retention period, in months (older data will be purged). Default is 12.
- Save and run the script.
-
To follow the progress of the script, you can run the following and see the row count go down on relevant tables:
SELECT o.NAME, i.rowcnt FROM sysindexes AS i INNER JOIN sysobjects AS o ON i.id = o.id WHERE i.indid < 2 AND OBJECTPROPERTY(o.id, 'IsMSShipped') = 0 ORDER BY i.rowcnt desc --ORDER BY o.name
-
To follow the progress of the DB trimming operation (which runs at the end of the script), run the following:
SELECT percent_complete, start_time, status, command, estimated_completion_time, cpu_time, total_elapsed_time FROM sys.dm_exec_requests WHERE command = 'DbccFilesCompact'
Reduce retention period for Mongo DB data
By default, MongoDB keeps sandbox Activity Feed/Output event data for six months. Depending on your CloudShell deployment, you may need to reduce the retention period. For details, see Limiting the time to keep sandbox Activity Feed/Output data.
Tip: You do not need to set a maintenance window for this action.