Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HPCC-27029 Start/stop ELK by default via docker scripts #15718

Conversation

rpastrana
Copy link
Member

@rpastrana rpastrana commented Feb 1, 2022

  • Starts Elastic Stack instance by default via startall.sh
  • Stops Elastic Stack instance by default via stopall.sh
  • Performs dependancy check on 3rd party charts on start
  • Documents legacy option related to ELK

Signed-off-by: Rodrigo Pastrana [email protected]

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

@github-actions
Copy link

github-actions bot commented Feb 1, 2022

@rpastrana
Copy link
Member Author

@richardkchapman this version will perform a helm dependency check on the elastic4hpcclogs charts on every start.
Which does have a (negligible?) added overhead.
The other option I considered was keeping the -c option in place for elastic4hpcclogs.
The users would be explicitly prompted to use the -c if the dependencies are absent, and would only need to do it once.

@@ -87,7 +88,7 @@ while [ "$#" -gt 0 ]; do
echo " -c Update chart dependencies"
echo " -p <location> Use local persistent data"
echo " -pv <yamlfile> Override dataplane definitions for local persistent data"
echo " -e Deploy light-weight Elastic Stack for component log processing"
echo " -e Ignored - light-weight Elastic Stack deployed by default"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be sensible to repurpose -e to suppress the deployment of elk?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would definitely be useful, I was concerned about overwriting the option w/ the complete opposite behavior, but I doubt anybody would be affected by that change. I'll repurpose to suppress.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question on the uninstall?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The stopall attempts to uninstall the elastic cluster regardless, no need to include the option.
That's based on precedent behavior encouraged by Jake/Gavin in other PRs related to stopall.

@rpastrana rpastrana force-pushed the HPCC-27029-start-elk-bydefault branch 2 times, most recently from 418aaa8 to cf49b40 Compare February 2, 2022 17:51
@richardkchapman
Copy link
Member

richardkchapman commented Feb 3, 2022

I got the following warnings (errors) when I ran startall without params:

Deploying myelastic4hpcclogs - light-weight Elastic Stack:
NAME VERSION REPOSITORY STATUS
filebeat 7.16.3 https://helm.elastic.co wrong version
elasticsearch 7.16.3 https://helm.elastic.co wrong version
kibana 7.16.3 https://helm.elastic.co wrong version

W0203 09:53:25.825499 93370 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W0203 09:53:25.831727 93370 warnings.go:70] rbac.authorization.k8s.io/v1beta1 ClusterRole is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRole
W0203 09:53:25.833380 93370 warnings.go:70] rbac.authorization.k8s.io/v1beta1 ClusterRoleBinding is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRoleBinding
W0203 09:53:25.863538 93370 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W0203 09:53:25.876382 93370 warnings.go:70] rbac.authorization.k8s.io/v1beta1 ClusterRole is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRole
W0203 09:53:25.880616 93370 warnings.go:70] rbac.authorization.k8s.io/v1beta1 ClusterRoleBinding is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRoleBinding

Issue reported to elastic: elastic/helm-charts#1568

@richardkchapman
Copy link
Member

System started ok, was able to connect a browser to kibana at localhost:5601. But no idea what to do with it - should we prepopulate a standard dashboard, have it already pointed at our elasticsearch?

@richardkchapman
Copy link
Member

I'm not convinced that the default behaviour of "leave the pvc after running stopall.sh" is desirable. Default behaviour of startall/stopall in all other respects is "start from clean, and leave no trace".

@rpastrana
Copy link
Member Author

I'm not convinced that the default behaviour of "leave the pvc after running stopall.sh" is desirable. Default behaviour of startall/stopall in all other respects is "start from clean, and leave no trace".

Fair enough... we can add logic to ensure the Elastic PVC is deleted upon stopall...
I should be able to track, or at least deduce the Elastic PVC name, if we feel removing all PVCs is too intrusive.
Have we considered assigning a namespace in the start-all?

@rpastrana
Copy link
Member Author

rpastrana commented Feb 3, 2022

filebeat 7.16.3 https://helm.elastic.co wrong version
elasticsearch 7.16.3 https://helm.elastic.co wrong version
kibana 7.16.3 https://helm.elastic.co wrong version

The "wrong version" status means Helm found the target charts locally, but they were apparently outdated (not surprised since we recently bumped the ELK charts). We hard-coded a dependency update in startall and it should have pulled the latest charts, Out of curiosity does helm list report the following?
$ helm list
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
...
myelastic4hpcclogs default 1 2022-02-03 09:08:56.5748994 -0500 EST deployed elastic4hpcclogs-1.2.2 7.16.3

The k8s warnings have to be addressed, I've reported a few of them on the Elastic bug tracker but your output shows some I haven't seen before. Do you mind sharing the Helm/k8s/docker version you're on? I'll use that info to report issue.

Pre-existing issue: elastic/helm-charts#1055

@rpastrana
Copy link
Member Author

rpastrana commented Feb 3, 2022

System started ok, was able to connect a browser to kibana at localhost:5601. But no idea what to do with it - should we prepopulate a standard dashboard, have it already pointed at our elasticsearch?

Absolutely, I'm targeting the discover view with the hpcc log centric fields laid out by default as the defaultroute in Kibana

https://track.hpccsystems.com/browse/HPCC-27127

image

@richardkchapman
Copy link
Member

We could (optionally) use a namespace, but I suspect a lot of developers have got used to there not being one and may have scripts that would fail. Would be better to deduce the pvc name I think

if [[ -z "${DEP_UPDATE_ARG}" ]]; then
dependency_check "managed/logging/elastic"
fi
dependency_check "managed/logging/elastic"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, this change is wrong and should be reverted

@rpastrana rpastrana force-pushed the HPCC-27029-start-elk-bydefault branch 2 times, most recently from c62eab7 to 6d237ea Compare February 4, 2022 20:18
@rpastrana
Copy link
Member Author

rpastrana commented Feb 4, 2022

@richardkchapman I've reversed the dep-check change.
As mentioned offline, it appears the helm install --dependency-update doesn't work exactly like the helm dependency update command.
It seems the former doesn't update outdated charts:

Helm dependency update doc:
https://helm.sh/docs/helm/helm_dependency_update/#synopsis

Helm install options doc:
https://helm.sh/docs/helm/helm_install/#options

https://track.hpccsystems.com/browse/HPCC-27135

@richardkchapman
Copy link
Member

@rpastrana Looks reasonable - can you squash

- Starts Elastic Stack instance by default via startall.sh
- Stops Elastic Stack instance by default via stopall.sh
- Reverses -e option to supress deployment of elk
- Adds logic to delete elastic PVC/PV
- Defines option to suppress deletion of elastic4hpcclogs

Signed-off-by: Rodrigo Pastrana <[email protected]>
@rpastrana rpastrana force-pushed the HPCC-27029-start-elk-bydefault branch from 5fbbe1a to e20e64c Compare February 8, 2022 16:42
@rpastrana
Copy link
Member Author

@richardkchapman squashed

@HPCCSmoketest
Copy link
Contributor

Automated Smoketest: ✅
OS: centos 7.6.1810 (Linux 3.10.0-957.1.3.el7.x86_64)
Host: ip-10-20-0-168.ca-central-1.compute.internal
GCC: gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
Git: 2.9.5, CMake: 3.22.1, cUrl: 7.67.0, node.js: v16.13.1, npm: 8.1.2
Sha: e20e64c
Containerized:True
Build: success
Milestone:Install hpccsystems-platform-community_8.6.0-closedown0.el7.x86_64.rpm
Time stats:

Prep time Build time Package time Install time Start time Test time Stop time Summary
14 sec (00:00:14) 400 sec (00:06:40) 104 sec (00:01:44) 21 sec (00:00:21) 0 sec (00:00:00) 0 sec (00:00:00) 0 sec (00:00:00) 539 sec (00:08:59)

@richardkchapman richardkchapman merged commit 172fb9c into hpcc-systems:candidate-8.6.x Feb 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants