The Scientific Method: a call to action

Jeff Moden SSC Guru Points: 1004680 · Answer 1

meilenb (5/23/2015)
I don't require you to agree with me.
I don't require you to be right either.

Heh... remember that.;-)

meilenb (5/24/2015)
But if you don't get the product out the door you will never have any money to fix it later. Successful people don't wait around while scientists argue the merits of their experiments.

It's funny how people think that. If you get the product out the door and it's broken, you're going to need a whole lot more money to fix it than if you did it right the first time. The hacks on several large corporations in the last two years are fine testimony to that.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004680 · Answer 2

TomThomson (5/23/2015)
That's a great editorial, and I totally agree with it. If everyone based design and build of systems only on verified (or at least adequately tested) hypotheses we would suffer a lot less from bugs and other unintended behaviour, and the cost of software/system development and support would be significantly reduced. It is really crazy to push out assertions without any supporting evidence, which ultimately has to be test and measurement based. Unfortunately, while many developers, DBAs, and engineers understand that, there are a lot of managers out there who either have forgotten it or never understood it and drive their staff into forgetting it.
But the scientific method doesn't generally lead to proofs - the hypotheses are falsifiable by test (or at least they were until string theorists got hold of physics) but a failure to falsify merely means that the predictions the hypothesis leads to are good for certain experiments and only within the limits of accuracy of measurement, not that it's some sort of absolute truth. Many experiments may be needed to give a decent degree of confidence, and even after that we may hit circumstances where the hypothesis turns out not to fit - for example Newtonian mechanics survived experiment for centuries, but is now known to be false (although it's still valid for many purposes) because it doesn't work for things on the atomic or smaller scale and it doesn't work on the astronomical scale (gets the orbit of Mercury wrong, for example).
Yes, I prefer to prove things - I'm still a mathematician at heart; and yes, I can prove some statements about performance because they may be simple statements of pure mathematics. But there are far more performance statements where all I can prove is that there's a decent probability that a particular algorithm will give better performance than another, because which performs best often depends on what data is fed to them and what environment they run in, and in some cases I can't even do that to any useful extent (in which case I'd prefer not to produce any software for that problem). But I can formulate hypotheses and test them and publish the hypotheses and the tests even though by doing that I get no proof of correctness, only either a better degree of confidence or a proof of incorrectness - which is of course what happens in real science (as opposed to in maths and logic and string theory) and is what the scientific method is (or perhaps used to be) all about.
And sometimes I want to throw some software together in order to get some measurements in order to formulate some hypotheses - experimentation to allow hypotheses to be formulated is just as much part of the scientific method as is experiment to test, and is clearly applicable in the computing world. "I did this and got these results and can't make head or tail of them" is a perfectly reasonable thing to say in a scientific paper too, and if computing and/or software engineering are science based that sort of paper must be allowed too, not just papers describing hypotheses and the experiments carried out to check them.

Well stated and true. That's what I meant when I said you have to reprove experiments that have been cited rather than just take their word for it. The experiment has to be set up correctly, and not just for the question at hand.

--Jeff Moden

meilenb Old Hand Points: 361 · Answer 3

It doesn't have to be broken to get it out the door.

That is a false choice.

Lynn Pettis SSC Guru Points: 442467 · Answer 4

meilenb (5/24/2015)
It doesn't have to be broken to get it out the door.
That is a false choice.

If it doesn't meet the customer's expectations, or fails to meet the customer's needs, it's broken.

Gail Shaw SSC Guru Points: 1004514 · Answer 5

TomThomson (5/23/2015)
But the scientific method doesn't generally lead to proofs - <large snip>

Of course, it's a fair bit more complex than what I wrote. The full scope is well beyond an editorial.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

meilenb Old Hand Points: 361 · Answer 6

The definition of broken is not at issue.

The assumption something is broken is not a valid assumption just because you disagree with a point.

Funny how people need to suggest this in order to validate their argument.

Very unscientific...

Lynn Pettis SSC Guru Points: 442467 · Answer 7

meilenb (5/24/2015)
The definition of broken is not at issue.
The assumption something is broken is not a valid assumption just because you disagree with a point.
Funny how people need to suggest this in order to validate their argument.
Very unscientific...

If you fail to test your assumptions while building your application and use constructs that are not scalable or are inefficient simply to get your application out the door, then you are releasing a "broken" application. It will cost you more to fix it later than if you had bothered to do it right from the start.

meilenb Old Hand Points: 361 · Answer 8

May 24, 2015 at 3:50 pm

#1800295

True,

But nobody is advocating that.

Jeff Moden SSC Guru Points: 1004680 · Answer 9

meilenb (5/24/2015)
It doesn't have to be broken to get it out the door.
That is a false choice.

The problem is that it frequently is broken but no one knows because they elected to ship rather than test. This is especially true when it comes to scalability and concurrency.

--Jeff Moden

meilenb Old Hand Points: 361 · Answer 10

May 24, 2015 at 4:06 pm

#1800297

Again true

And again, nobody is advocating that

Jeff Moden SSC Guru Points: 1004680 · Answer 11

meilenb (5/24/2015)
Again true
And again, nobody is advocating that

Not correct. It's certainly a part of what the article covered... and advocated.

--Jeff Moden

jurgen.lottermoser Valued Member Points: 70 · Answer 12

Ideally, we'd put the same values in both columns. However, that would limit the test data set to 9999 rows, as '10000' doesn't fit into a char(4).

wouldn't having the same values in both columns in this case mean turning the ints into strings rather than having the same number in both of them? i.e. so that they actually contain the same binary data? e.g. 10,000 is 0010011100010000, which is "'?" where the ? is character 16, which in ASCII is a non-printing control character.

in which case, why would joining on ints be faster than joining on strings? is it because, depending on the encoding of the string, it can't just be treated as a number? but if the encoding is the same then surely they can? or is it that some values cause problems for some encodings, e.g. the first 32 values of ASCII?
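
The byte arithmetic in the post above can be checked with a few lines of Python. This is only a sketch of the encoding point: the helper names `int_bytes` and `char_bytes` are made up for illustration, and nothing here reflects how SQL Server stores or compares values internally.

```python
# Illustration only: compares the raw bytes of an integer with the bytes of
# its decimal-digit string. Helper names are hypothetical, not from the thread.

def int_bytes(n: int, width: int = 2) -> bytes:
    """Big-endian bytes of n, matching the bit pattern written in the post."""
    return n.to_bytes(width, "big")

def char_bytes(n: int) -> bytes:
    """ASCII bytes of the decimal digits of n, as a cast to a string would give."""
    return str(n).encode("ascii")

as_int = int_bytes(10_000)
as_text = char_bytes(10_000)

print(format(10_000, "016b"))   # the 16-bit pattern of the integer
print(as_int)                   # two bytes: 0x27 (apostrophe) and 0x10 (character 16)
print(as_text, len(as_text))    # five digit characters, one too many for a char(4)
```

So the integer 10,000 and the string '10000' hold different bytes, and the string form needs five characters, which is why casting caps a char(4) test set at 9999.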

Gail Shaw SSC Guru Points: 1004514 · Answer 13

jurgen.lottermoser (5/24/2015)
wouldn't having the same values in both columns in this case mean turning the ints into strings rather than having the same number in both of them?

You could certainly test that way, though you'll have to be careful to keep the column sizes the same. Let me know what differences in performance you see when tested that way. 🙂

To be honest, I wasn't trying to keep the values exactly the same, I wanted to keep the size the same and the easiest way to populate the two tables was to use the same data generation and just CAST the int to string. Maybe it was a flawed methodology, feel free to re-test your way and either refute or confirm my results (either is good)

Gail Shaw

meilenb Old Hand Points: 361 · Answer 14

No one is advocating shipping without testing so your comment on that matter is not correct.

Lynn Pettis SSC Guru Points: 442467 · Answer 15

meilenb (5/24/2015)
No one is advocating shipping without testing so your comment on that matter is not correct.

From your statement here (emphasis mine):

ROI = (Gain from Investment - Cost of Investment) / Cost of Investment
Yes, I hear the "to be" replies (It costs less to catch a flaw up front than to fix it later, etc.). But if you don't get the product out the door you will never have any money to fix it later. Successful people don't wait around while scientists argue the merits of their experiments.