Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

plan: maintain HistColl in Selection and Projection's statsInfo. #7595

Closed
wants to merge 10 commits into from

Conversation

winoros
Copy link
Member

@winoros winoros commented Sep 3, 2018

What problem does this PR solve?

Maintain the HistColl in Selection and Projection.

Note: Currently maintain it or not is controlled by tidb_optimizer_selectivity_level.

What is changed and how it works?

For selection it only needs to assign the HistColl of its child.(Currently we don't handle the bound value of the Histograms)

For projection, build new map.

Check List

Tests

  • Unit test

plan/stats.go Outdated
@@ -47,7 +47,6 @@ func (s *statsInfo) scale(factor float64) *statsInfo {
profile := &statsInfo{
count: s.count * factor,
cardinality: make([]float64, len(s.cardinality)),
histColl: s.histColl,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scale is called by scaleByExpectedCnt which is called many many times when building physical plan.
It's hard to handle expectedCnt with stats estimation by Histogram. So I think in such case HistColl is not need to maintained. Thus maintain the HistColl out of this method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But since it's the problem of scaleByExpecteCnt, why not do it in that function?

Copy link
Member Author

@winoros winoros Sep 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Emm.. i need some clearer explain?

@winoros winoros added the sig/planner SIG: Planner label Sep 3, 2018
@@ -293,7 +293,7 @@ func ValidateSetSystemVar(vars *SessionVars, name string, value string) (string,
TiDBDistSQLScanConcurrency,
TiDBIndexSerialScanConcurrency, TiDBDDLReorgWorkerCount,
TiDBBackoffLockFast, TiDBMaxChunkSize,
TiDBDMLBatchSize, TiDBOptimizerSelectivityLevel:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This validation is not right. The default value of it is zero. But here just checks what it should not be zero.

plan/cbo_test.go Outdated
@@ -690,6 +690,28 @@ func (s *testAnalyzeSuite) TestInconsistentEstimation(c *C) {
))
}

func (s *testAnalyzeSuite) TestHistogramInPlan(c *C) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comments in the test case.

plan/stats.go Outdated
return p.stats, nil
}

func (p *LogicalProjection) deriveHistStats(childProfile *statsInfo) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comments in the function.

plan/stats.go Outdated
@@ -47,7 +47,6 @@ func (s *statsInfo) scale(factor float64) *statsInfo {
profile := &statsInfo{
count: s.count * factor,
cardinality: make([]float64, len(s.cardinality)),
histColl: s.histColl,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But since it's the problem of scaleByExpecteCnt, why not do it in that function?

plan/stats.go Outdated
@@ -174,7 +175,23 @@ func (p *LogicalSelection) deriveStats() (*statsInfo, error) {
if err != nil {
return nil, errors.Trace(err)
}
p.stats = childProfile.scale(selectionFactor)
factor := selectionFactor
if p.ctx.GetSessionVars().OptimizerSelectivityLevel >= 1 && childProfile.histColl != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does OptimizerSelectivityLevel >= 1 mean?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just its literal meaning?
This variable controls how we maintain stats to calculate selectivity. The higher it is, the more powerful way we use. Currently, it should be 0 or 1.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How to support OptimizerSelectivityLevel=2?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, when it's 1, we only use the DataSource's Histogram and won't create newly one. When it's 2, we'll create new Histogram during planning.

Copy link
Member

@zz-jason zz-jason Sep 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about making some constant values to represent different levels, for example:

const (
	SLPseudo int = 0
	SLStatsDataSourceOnly = 1
	SLStatsAll = 2
)

plan/stats.go Outdated
@@ -174,7 +175,23 @@ func (p *LogicalSelection) deriveStats() (*statsInfo, error) {
if err != nil {
return nil, errors.Trace(err)
}
p.stats = childProfile.scale(selectionFactor)
factor := selectionFactor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about s/factor/selecity/?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Em, selecity is not a word, you mean selectivity?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah.

plan/stats.go Outdated
@@ -110,11 +109,10 @@ func (p *baseLogicalPlan) deriveStats() (*statsInfo, error) {
return profile, nil
}

func (ds *DataSource) getStatsByFilter(conds expression.CNFExprs) *statsInfo {
profile := &statsInfo{
func (ds *DataSource) getStatsByFilter(conds expression.CNFExprs) (profile *statsInfo, histColl *statistics.HistColl) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that all the code changes related to the functions of DataSource is not irrelevant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can extract the same part with selection.deriveStats

@winoros
Copy link
Member Author

winoros commented Sep 26, 2018

Em.. Some unresolved comments are hidden after merged master...

@winoros
Copy link
Member Author

winoros commented Sep 27, 2018

@zz-jason

how about making some constant values to represent different levels, for example:
const (
SLPseudo int = 0
SLStatsDataSourceOnly = 1
SLStatsAll = 2
)

I found that no other variable creates const like this. Could we add one for this variable?

@winoros
Copy link
Member Author

winoros commented Oct 10, 2018

Will re-push it again later

@winoros winoros closed this Oct 10, 2018
@winoros winoros deleted the hist-proj-select branch March 15, 2019 07:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/planner SIG: Planner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants