A cheklist of SQL server performance consideration for architecting/designing systems that deal with very large volumes of data.They are in no particular order. SQL performance tuning is necessary to optimise a SQL Server database.
Careful researching and planning is required to design and manage a write-intensive SQL Server database. Test all SQL Server configuration changes, before moving into Production. Consider the implications of making changes.
*Run as many load processes as you have available CPUs. If you have 8 CPUs, run 8 parallel loads. If you have 2 CPUs, run 2 parallel loads.BULK INSERT pubs..authors2 FROM 'c:\authors.txt'
WITH (
DATAFILETYPE = 'char',
FIELDTERMINATOR = ',',
TABLOCK
)
The TABLOCK can also be used in the
INSERT INTO myTable1 SELECT * FROM myTable2 WITH(TABLOCK) type of statement
*Use ROWS PER BATCH = 2600, or something near this if you are importing multiple streams into one table.Use ROWS PER BATCH=2600 , primarily to avoid lock escalation
*Place database in BULK LOGGED mode .This records extent allocations and related metadata , but not the individual page changes and not the individual record inserts.
*In the case of row-versioning, indexing ,temporary objects there is a greater dependance on TempDB. In those case RAID 10 is the way to go on TempDB. There are cost implications , and if so use RAID 0. If there isn't a lot of write-activity RAID 5 will suffice. SQL Server – tempdb and solid state drives
* Multiple writing to the data aspect of the database can be done concurently by using multiple disk drives.A major consideration is striping data over multiple disk drives.RAID 10 is the optimal . RAID 10 mirrors every drive and stripes the data. At min : RAID 1. RAID 5 is better suited to read-only environment. I will post some stats of results in comparison
* SQL Server writes to the transaction log file in a synchronous mode.Therefore fully logged changes is generally limited by the log writing activity.Taking this one step further, if the writes to the transaction log reach the disk throughput, the data modifications will have to be limited by the log writes.One approach is to split the the db across multiple dbs , all with separate log files , all on separate hard drives. Logging is often the most crirical performance factor in an intense INSERT environment. Aim to have the log files on a dedicated hard disk
* The nature of the application may have to dictate what type of INSERT process is required. Typically there are 3 INSERT approaches. 1)BULK 2)multirow INSERT such as INSERT SELECT 3)individual INSERT. The order is outlined is the order of efficiency , with 1) being the most efficient. The level of logging is the key differentiator for the above 3 methods.
* Even though BULK INSERTS are the most efficient , i.e place a number of INSERTS within a single transaction , there isn't an objective number that is optimal. A degree of testing is required. I will post notes , on a given hardware,indexing architecture as to results I've experienced. An effective approach is benchmarking , increasing or decreasing the amount of rows per BULK INSERT.
* Dealing with very large tables and manipulating is primarily not about space, but considering whether to delete rows or truncate / drop tables. Regardless of whether there is a clause involved in the DELETE statement , a DELETE statement when dealing with large data tables is expensive. Truncating/Dropping is inexpensive in comparison .
For example, if you have 1,000,000 table rows at 2,000 bytes each , you are looking at 2gb space.
i.e approx , 1,000,000 rows x 2,000bytes = 2,000,000,000 bytes = 2 GB
* Hardware considerations- speak to your Hardware engineers
Source:Jack Vamvas (http://www.sqlserver-dba.com)
This is only a preview. Your comment has not yet been posted.
As a final step before posting your comment, enter the letters and numbers you see in the image below. This prevents automated programs from posting comments.
Having trouble reading this image? View an alternate.
Posted by: |