2017-09-12
1,033 reads
When we have to deal with and store a lot of data, it makes sense to aggregate it so that we store only the information we actually need. Done right, this works well, but designing such a system takes care and thought because the problems can be subtle and varied. Joe Celko describes some of the ways that things can go wrong and end up providing incorrect, inaccurate or misleading results.
2017-09-12
3,866 reads
2017-09-05
1,103 reads
Phil Factor shows how to use the Mann-Whitney U test in SQL to find out whether two samples come from the same distribution.
2017-09-04
4,456 reads
2017-08-31
1,150 reads
2017-08-22
1,202 reads
Before you report your conclusions about your data, have you checked whether your 'actionable' figures occurred by chance? The Kruskal-Wallis test is a safe way of determining whether samples come from the same population, because it is simple and doesn't rely on a normal distribution in the population. This allows you a measure of confidence that your results are 'significant'. Phil Factor explains how to do it.
2017-07-27
6,123 reads
Technical debt is a real problem in database development, where corners have been cut in the rush to keep to dates. The result may work but the problems are in the details: such things as inconsistent naming of objects, or of defining columns; sloppy use of data types, archaic syntax or obsolete system functions. With databases, technical debt is even harder to pay back. Robert Sheldon explains how and why you can get it right first time instead.
2017-07-25
5,860 reads
User-Defined Functions (UDFs) are an essential part of the database developers' armoury. They are extraordinarily versatile, but just because you can even use scalar UDFs in WHERE clauses, computed columns and check constraints doesn't mean that you should. Multi-statement UDFs come at a cost and it is good to understand all the restrictions and potential drawbacks. Phil Factor gives an overview of User-defined functions: their virtues, vices and their syntax.
2017-07-21
5,686 reads
If the design of a relational database is wrong, no amount of clever DML SQL will make it work well. Dr. Codd’s Information Principle is that you have, inside the entity tables, the columns that model the attributes of that entity. The columns contain scalar values. Tables that model relationships can have attributes, but they must have references to entities in the schema. You split those attributes at your peril. Joe Celko explains the basics.
2017-07-18
3,822 reads