Compare commits

337 Commits

Author SHA1 Message Date
Arthur
097791a380 OKD / OCP 4.14
Signed-off-by: Arthur <arthur@arthurvardevanyan.com>
2023-10-31 21:59:14 +08:00
David Ko
548aa65973
Update README.md 2023-10-26 22:54:29 +08:00
David Ko
c8bf012b13
Update README.md 2023-10-26 22:53:44 +08:00
arlan lloyd
febfa7eef7 add conditional
Signed-off-by: arlan lloyd <arlanlloyd@gmail.com>
2023-10-23 16:15:00 +08:00
Jongwoo Han
15fae1ba47 Replace deprecated command with environment file
Signed-off-by: Jongwoo Han <jongwooo.han@gmail.com>
2023-10-20 00:01:51 +08:00
Derek Su
a04760a08b example: add an example for encrypted volume in block mode
Longhorn 4883

Signed-off-by: Derek Su <derek.su@suse.com>
2023-10-19 22:57:55 +08:00
Phan Le
78fee8e05b Fix bug: check script fails to perform all checks
When piping the script to bash (cat ./environment_check.sh | bash), the part of the script after `kubectl exec -i` is interpreted as input for the command run inside `kubectl exec`. As a result, the env check script doesn't perform the steps after that `kubectl exec` command. Removing the `-i` flag fixes the issue.

Also, replace `kubectl exec -t` with plain `kubectl exec`, because the input of the `kubectl exec` command is not a terminal device.

longhorn-5653

Signed-off-by: Phan Le <phan.le@suse.com>
2023-10-19 21:52:26 +08:00
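A minimal sketch of the failure mode this fix addresses (the pod name and commands below are illustrative placeholders, not the actual environment_check.sh contents): when the whole script arrives on bash's stdin, a `kubectl exec -i` inside it inherits that same stdin, so the remote command swallows the rest of the script.

```sh
# Symptom: the script is piped into bash, so bash reads it from stdin.
# `kubectl exec -i` attaches that same stdin to the remote command,
# which then consumes everything that follows in the script.
cat <<'EOF' | bash
echo "step 1"
kubectl exec -i some-pod -- cat > /dev/null   # `cat` swallows the remaining lines
echo "step 2"                                 # never runs when piped
EOF

# Fix from this commit: drop `-i` (and `-t`) so the remote command gets no stdin
# and bash keeps reading its own script.
cat <<'EOF' | bash
echo "step 1"
kubectl exec some-pod -- uname -r
echo "step 2"                                 # runs as expected
EOF
```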
Phan Le
d30a970ea8 Add kernel release check to environment_check.sh
longhorn-6854

Signed-off-by: Phan Le <phan.le@suse.com>
2023-10-11 15:23:49 -07:00
James Lu
8c6a3f5142 fix(typo): codespell test failed
Fixed some typos that codespell found.

Signed-off-by: James Lu <james.lu@suse.com>
2023-10-05 11:28:27 +08:00
Jack Lin
e1914963a6
doc(chart): add table of helm values (#6639)
Co-authored-by: David Ko <dko@suse.com>
2023-10-04 23:47:50 +08:00
James Munson
c0a258afef Add nfsOptions parameter to sample storageclass.yaml
Signed-off-by: James Munson <james.munson@suse.com>
2023-09-23 00:28:47 +08:00
Derek Su
cb61e92a13 Replace spec.engineImage with spec.Image in volume, replica and engine resources
Longhorn 6647

Signed-off-by: Derek Su <derek.su@suse.com>
2023-09-19 18:32:41 +08:00
David Ko
963ccf68eb
add require/backport to new bug issue 2023-09-14 12:03:16 +08:00
David Ko
c98bef59b8
Add require/backport for new improvement ticket 2023-09-14 12:02:42 +08:00
Eric Weber
9948983b15 Add ReplicaDiskSoftAntiAffinity setting to upgrade responder
Signed-off-by: Eric Weber <eric.weber@suse.com>
2023-09-06 14:42:45 -07:00
Eric Weber
dd3f5584f6 Add ReplicaDiskSoftAntiAffinity setting
Signed-off-by: Eric Weber <eric.weber@suse.com>
2023-09-06 14:42:45 -07:00
David Ko
8615cfc8d9
Update bug.md 2023-09-05 18:15:35 +08:00
David Ko
f6ef492a1d
Update bug.md 2023-09-05 18:14:56 +08:00
Arthur Vardevanyan
67b4b38a12
OCP / OKD Documentation and Helm Support (#5004)
* Add OCP support to helm chart.

Signed-off-by: Bin Guo <bin.guo@casa-systems.com>
Signed-off-by: Arthur <arthur@arthurvardevanyan.com>
Co-authored-by: Bin Guo <bin.guo@casa-systems.com>
Co-authored-by: David Ko <dko@suse.com>
Co-authored-by: binguo-casa <70552879+binguo-casa@users.noreply.github.com>
2023-08-24 19:12:38 +08:00
Chris
39187d64d5 add cifs backupstore manifest
ref: 6530

Signed-off-by: Chris <chris.chien@suse.com>
2023-08-24 18:52:04 +08:00
Eric Weber
ad20475f11 Clarify flag names and integration test behavior
Signed-off-by: Eric Weber <eric.weber@suse.com>
2023-08-23 14:08:14 +08:00
Eric Weber
b76d853800 Modify Engine Identity Validation LEP for changes to longhorn-engine PR
Signed-off-by: Eric Weber <eric.weber@suse.com>
2023-08-23 14:08:14 +08:00
Chin-Ya Huang
914fb89687 feat(support-bundle): version bump
ref: 6544

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-08-22 14:23:00 +08:00
davidko
b5379ad6b7 chore(ghaction): create backport issues if actor is member
Signed-off-by: davidko <dko@suse.com>
2023-08-22 09:30:26 +08:00
Chin-Ya Huang
3b04fa8c02 feat(lep): engine upgrade enforcement design
ref: 5842

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-08-14 20:25:47 +08:00
Damiano Cipriani
23e2f299b8 fix(lep): update longhorn-spdk-engine assignments
Signed-off-by: Damiano Cipriani <damiano.cipriani@suse.com>
2023-08-12 00:31:02 +08:00
Damiano Cipriani
e6d8d83c96 fix(lep): update SPDK Control Plane diagram
Signed-off-by: Damiano Cipriani <damiano.cipriani@suse.com>
2023-08-12 00:31:02 +08:00
Damiano Cipriani
e689e0da09 fix(lep): SPDK engine codespell errors
Signed-off-by: Damiano Cipriani <damiano.cipriani@suse.com>
2023-08-12 00:31:02 +08:00
Damiano Cipriani
9938548dda fix(lep): update SPDK engine with comments
Signed-off-by: Damiano Cipriani <damiano.cipriani@suse.com>
2023-08-12 00:31:02 +08:00
Damiano Cipriani
f0df91d31f feat(lep): reimplement longhorn-engine with SPDK
Signed-off-by: Damiano Cipriani <damiano.cipriani@suse.com>
2023-08-12 00:31:02 +08:00
Shuo Wu
dbee4f9d6e enhancement: Add a new enhancement 'spdk-engine'
Longhorn 5406

Signed-off-by: Shuo Wu <shuo.wu@suse.com>
2023-08-11 23:58:39 +08:00
Derek Su
c760abf0ca feat(lep): add SPDK volume support
Longhorn 5778
Longhorn 5711
Longhorn 5744
Longhorn 5827

Signed-off-by: Derek Su <derek.su@suse.com>
2023-08-11 23:48:16 +08:00
Jack Lin
ac30a7e5ea lep(backingimage): backingimage backup support
ref: longhorn/longhorn 4165

Signed-off-by: Jack Lin <jack.lin@suse.com>
2023-08-11 18:25:16 +08:00
Jack Lin
8146f37681 feat(setting): add allow empty disk node selector volume setting
ref: longhorn/longhorn 4826

Signed-off-by: Jack Lin <jack.lin@suse.com>
2023-08-08 17:15:20 +08:00
davidko
7e3e61b76b chore(chart): update version to 1.6.0-dev
Signed-off-by: davidko <dko@suse.com>
2023-08-08 16:28:30 +08:00
Eric Weber
eb3e413c6a Add LEP for disk anti-affinity
Longhorn 3823

Signed-off-by: Eric Weber <eric.weber@suse.com>
2023-08-07 22:43:19 +08:00
Yarden Shoham
339e501042 chart: Update settings based on the instance managers consolidation
- Add the setting added in https://github.com/longhorn/longhorn-manager/pull/1731 in the helm chart
- Related to https://github.com/longhorn/longhorn/issues/5208

Signed-off-by: Yarden Shoham <git@yardenshoham.com>
2023-08-07 17:18:03 +08:00
davidko
1e8bd45c63 chore(ghaction): fix invalid member check
Signed-off-by: davidko <dko@suse.com>
2023-08-03 19:24:20 +08:00
David Ko
519d087a88 chore(ghaction): format
Signed-off-by: David Ko <dko@suse.com>
Signed-off-by: davidko <dko@suse.com>
2023-08-03 17:41:01 +08:00
David Ko
d87927ed85 chore(ghaction): make auto-generated issues only from members
Signed-off-by: David Ko <dko@suse.com>
Signed-off-by: davidko <dko@suse.com>
2023-08-03 17:41:01 +08:00
David Ko
852cf2c3f0
Update README.md 2023-08-03 13:24:56 +08:00
Phan Le
8124f74317 Increase the CPU and RAM limit for the upgrade responder server
From monitoring, the server is sometimes killed by OOM, which leads to deep dips in the graphs. As we are getting more Longhorn nodes now, we should increase the server's CPU and RAM.

Signed-off-by: Phan Le <phan.le@suse.com>
2023-08-02 21:36:25 -07:00
Phan Le
f9794f526a [upgrade-responder] release v1.5.1 and v1.4.3 Longhorn versions
Longhorn-6274
Longhorn-6102

Signed-off-by: Phan Le <phan.le@suse.com>
2023-08-02 21:36:25 -07:00
David Ko
e1f1d3de1b
Update test.md 2023-07-29 19:29:55 +08:00
David Ko
f7767ddc57
Update task.md 2023-07-29 19:28:36 +08:00
David Ko
833399b1d0
Update release.md 2023-07-29 19:27:40 +08:00
David Ko
e2759dae6f
Update refactor.md 2023-07-29 19:26:27 +08:00
David Ko
44f2b978ac
Update refactor.md 2023-07-29 19:26:07 +08:00
David Ko
ffa9824cd2
Update question.md 2023-07-29 19:25:03 +08:00
David Ko
944fcb2da7
Update infra.md 2023-07-29 19:23:34 +08:00
David Ko
0e382353c7
Update improvement.md 2023-07-29 19:22:38 +08:00
David Ko
faa8073f56
Update feature.md 2023-07-29 19:21:58 +08:00
David Ko
902cccd218
Update doc.md 2023-07-29 19:21:15 +08:00
David Ko
17f8382daa
Update bug.md 2023-07-29 19:18:19 +08:00
David Ko
6538e4ba72
Update bug.md 2023-07-29 19:16:50 +08:00
James Munson
a9cee48feb
Fix some small errors on StorageClass NodeSelector. (#6393)
Signed-off-by: James Munson <james.munson@suse.com>
2023-07-27 12:29:08 -07:00
James Munson
396a90c03b
Cleanup (#6243)
Remove obsolete and misleading cleanup script.

Longhorn-6316

Signed-off-by: James Munson <james.munson@suse.com>
2023-07-24 14:43:57 -07:00
David Ko
b01d2c2d18
Update and rename bug_report.md to bug.md 2023-07-20 16:18:20 +08:00
David Ko
2a5e32fc9f
Update bug_report.md 2023-07-20 16:16:24 +08:00
David Ko
8e979dce3b
Update and rename test_infra.md to infra.md 2023-07-20 08:48:05 +08:00
David Ko
cde061aa9b
Update bug_report.md 2023-07-20 08:45:54 +08:00
David Ko
929a1f9dee
Update improvement.md 2023-07-20 08:44:57 +08:00
David Ko
a988d3398d
Update feature.md 2023-07-20 08:44:31 +08:00
David Ko
b43ef9a11e
Create CHANGELOG-1.5.1.md 2023-07-19 19:21:23 +08:00
David Ko
2c9296cc4b
Update README.md 2023-07-19 19:20:28 +08:00
Derek Su
f2c474e636 chore(chart): remove webhooks and recovery-backend
Signed-off-by: Derek Su <derek.su@suse.com>
2023-07-17 12:55:00 +08:00
Austin Heyne
fab23a27aa Add reserve storage percentage in helm chart
- Add the StorageReservedPercentageForDefaultDisk configuration to the
helm chart.

Signed-off-by: Austin Heyne <aheyne@ccri.com>
2023-07-15 20:00:30 +08:00
David Ko
f8420c16c8
Create CHANGELOG-1.4.3.md 2023-07-14 22:28:04 +08:00
David Ko
132eb89bc8
Update README.md 2023-07-14 22:26:51 +08:00
Chin-Ya Huang
a43faae14a chore(support-bundle): version bump
ref: 6256

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-07-12 10:25:07 +08:00
Chin-Ya Huang
7ffd3512be fix(chart): update default setting log level
ref: 6257

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-07-11 16:46:34 +08:00
David Ko
39a724e109
Update CHANGELOG-1.5.0.md 2023-07-10 14:29:56 +08:00
David Ko
07de677d04
Update CHANGELOG-1.5.0.md 2023-07-10 14:28:38 +08:00
David Ko
f625f8d5c3
Create release.md 2023-07-10 14:26:01 +08:00
David Ko
392cd6ddbf
Create CHANGELOG-1.5.0.md 2023-07-07 14:33:40 +08:00
David Ko
63561e4d05
Create CHANGELOG-1.4.2.md 2023-07-07 14:33:12 +08:00
David Ko
33f374def5
Update MAINTAINERS 2023-06-30 12:06:30 +08:00
Derek Su
6b56bb2b72 Fix indent
Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-28 11:36:39 +08:00
Derek Su
0d94b6e4cf spdk: help install git before configuring spdk environment
Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-28 11:23:00 +08:00
Derek Su
46e1bb2cc3 Highlight CPU usage in v2-data-engine setting
Longhorn 6126

Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-28 11:23:00 +08:00
Phan Le
a0879b8167 Add volumeattachments resource to Longhorn ClusterRole
Longhorn-6197

Signed-off-by: Phan Le <phan.le@suse.com>
2023-06-27 12:11:46 +08:00
Chin-Ya Huang
15db0882ae feat(upgrade-responder): support requestSchema in setup script
ref: 5235

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-06-26 16:10:25 +08:00
James Lu
c1d6d93374 fix(deploy): remove error line in nfs backupstore
Remove an extra error line in backupstore/nfs-backupstore.yaml.

Signed-off-by: James Lu <james.lu@suse.com>
2023-06-21 11:25:40 +08:00
David Gaster
a601ecc468 ability to specify platform arch for air gap install
Signed-off-by: David Gaster <dngaster@gmail.com>
2023-06-19 15:40:50 +08:00
David Ko
2cd1af070e Update enhancements/20230616-automatic-offline-replica-rebuild.md 2023-06-19 13:24:15 +08:00
Derek Su
d1e712de90 lep: add automatic-offline-replica-rebuild
Longhorn 6071

Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-19 13:24:15 +08:00
Derek Su
27f482bd9b Reduce BackupConcurrentLimit and RestoreConcurrentLimit to 2
Longhorn 6135

Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-16 17:30:06 +08:00
Derek Su
1bbefa8132 Update examples
Longhorn 6126

Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-15 16:21:29 +08:00
Derek Su
cdc6447b88 Rename BackendStoreDrivers
Longhorn 6126

Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-15 16:21:29 +08:00
Derek Su
b8069c547b offline rebuilding/chart: add offline-replica-rebuilding setting
Longhorn 6071

Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-13 14:44:22 +08:00
Derek Su
2ae85e8dcb offline rebuilding/chart: update crd.yaml
Longhorn 6071

Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-13 14:44:22 +08:00
Derek Su
975239ecc9 spdk: nvme-cli should be equal to or greater than 1.12
go-spdk-helper can support nvme-cli v2.0+.

Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-13 14:28:02 +08:00
Eric Weber
34c07f3e5c Add iSCSI SELinux workaround for Fedora-like distributions
Signed-off-by: Eric Weber <eric.weber@suse.com>
2023-06-08 14:31:34 +08:00
Derek Su
7cbb97100e spdk: nvme-cli should be between 1.12 and 1.16
Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-08 12:31:49 +08:00
Derek Su
a5041e1cf3 spdk: update expected-nr-hugepages to 512 in environment_check.sh
Longhorn 5739

Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-06 12:17:34 +08:00
Derek Su
fa04ba6d29 spdk: use 1024 MiB huge pages by default
Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-05 20:30:29 +08:00
Tyler Hawkins
e45a9c04f3 fix: (chart) fix nodeDrainPolicy key
Removing a space between the key and colon.

Signed-off-by: Tyler Hawkins <3319104+tyzbit@users.noreply.github.com>
2023-06-03 06:04:08 +08:00
Eric Weber
b515d93963 Remove longhorn-manager affinity support for now
Signed-off-by: Eric Weber <eric.weber@suse.com>
2023-06-02 17:40:37 +08:00
Derek Su
c81ddd6e96 Update settings
Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-02 16:48:02 +08:00
James Lu
115edc0551 feat(dr-volume): force activate a dr volume LEP
Activate a dr volume as long as there is a ready replica.

Ref: 1512

Signed-off-by: James Lu <james.lu@suse.com>
2023-06-02 14:55:40 +08:00
Derek Su
9befa479b9 Update settings in values.yaml
Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-02 13:50:48 +08:00
Phan Le
985634be7f Remove deprecated allow-node-drain-with-last-healthy-replica setting
longhorn-5620

Signed-off-by: Phan Le <phan.le@suse.com>
2023-06-02 07:46:58 +08:00
Derek Su
f6f0db84be spdk/chart: update chart template and deployment manifest
Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-01 23:48:39 +08:00
Derek Su
32eaf99217 spdk/example: add example manifests
Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-01 23:48:39 +08:00
Derek Su
3a44ec93c9 spdk: reduce hugepage size to 1024 MiB and persist the hugepage setting
Signed-off-by: Derek Su <derek.su@suse.com>
2023-06-01 23:23:05 +08:00
Chin-Ya Huang
ffaa3d2113 chore(cleanup): update YAML
ref: 3289

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-31 10:51:37 +08:00
Chin-Ya Huang
779b7551fa chore(cleanup): update chart
Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-31 10:51:37 +08:00
Chin-Ya Huang
036ea2be75 feat(system-backup): update YAML
ref: 5011

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-30 15:27:34 +08:00
Chin-Ya Huang
5fc22ebca9 feat(system-backup): update chart
ref: 5011

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-30 15:27:34 +08:00
James Lu
7b3b230f47 chore: make backupstores deployment
Make backupstores deployments instead of pods.

Signed-off-by: James Lu <james.lu@suse.com>
2023-05-29 19:52:32 +08:00
James Lu
2a811e282b feat(backupstore): add azurite emulator
Add the Azurite manifest to start Azurite server for local testing

Ref: 1309

Signed-off-by: James Lu <james.lu@suse.com>
2023-05-29 19:52:32 +08:00
Chin-Ya Huang
9681de43de feat(lep): volume backup policy for Longhorn system backup design
ref: 5011

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-29 17:52:13 +08:00
Jack Lin
5a8f33df0f fix: change log level to Debug
ref: longhorn/longhorn 5888

Signed-off-by: Jack Lin <jack.lin@suse.com>
2023-05-29 15:23:46 +08:00
Eric Weber
b963844fec Change controller-instance-name flag and increment InstanceManagerProxyAPIVersion
Signed-off-by: Eric Weber <eric.weber@suse.com>
2023-05-26 23:21:13 +08:00
Eric Weber
a156587c6e Add Engine Identity Validation LEP
Signed-off-by: Eric Weber <eric.weber@suse.com>
2023-05-26 23:21:13 +08:00
Chin-Ya Huang
15a73c9e36 chore: update YAML
ref: 2865

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-26 10:54:55 +08:00
Chin-Ya Huang
6e1524ef46 chore: update chart
ref: 2865

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-26 10:54:55 +08:00
Jack Lin
7a0f6d99c6 feat(backup): store storage class name to backup volume
ref: longhorn/longhorn 3506

Signed-off-by: Jack Lin <jack.lin@suse.com>
2023-05-25 19:04:44 +08:00
James Lu
a929ab5644 chore: correct pre upgrade pod name
Correct `pre-upgrade job` pod name.

Ref: 5131

Signed-off-by: James Lu <james.lu@suse.com>
2023-05-24 20:14:36 +08:00
Chin-Ya Huang
1ce0fbabc1 chore: update YAML
ref: 5917

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-23 17:24:46 +08:00
Chin-Ya Huang
398d05c997 chore: update chart
ref: 5917

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-23 17:24:46 +08:00
Chin-Ya Huang
7a878def1a feat(lep): update set RecurringJob to PVCs design
ref: 5791

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-23 17:22:40 +08:00
Jack Lin
8364519d61 feat(network policy): add network policy into chart
ref: longhorn/longhorn 5403

Signed-off-by: Jack Lin <jack.lin@suse.com>
2023-05-23 13:22:58 +08:00
Chin-Ya Huang
77392d6ad8 feat(lep): set RecurringJob to PVCs design
ref: 5791

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-23 13:19:20 +08:00
Jack Lin
d6b173977b
Merge pull request #5923
* fix(csi): no need to check if volume is attached when creating backin…

* Merge master

* Merge branch 'master' into LH5005_fix_backingimage_management_via_csi…
2023-05-22 23:52:49 +08:00
Eric Weber
68c1dae851 Remove disable-replica-rebuild setting
Signed-off-by: Eric Weber <eric.weber@suse.com>
2023-05-22 23:40:49 +08:00
Jack Lin
309e228591 feat(log): add global setting for log level
ref: longhorn/longhorn 5888

Signed-off-by: Jack Lin <jack.lin@suse.com>
2023-05-19 16:52:34 +08:00
Derek Su
e580550561 prerequisite: set PCI_ALLOWED to none when setting up spdk environment
Currently, Longhorn uses the SPDK AIO feature.

Signed-off-by: Derek Su <derek.su@suse.com>
2023-05-18 00:55:44 +08:00
David Ko
7781fbef0e
Update README.md 2023-05-16 22:36:50 +08:00
David Ko
02f7e12546
Update create-issue.yml 2023-05-16 16:37:48 +08:00
Phan Le
7680931f88 Update chart for new AD refactoring
longhorn-3715

Signed-off-by: Phan Le <phan.le@suse.com>
2023-05-16 08:13:47 +08:00
Phan Le
17ce9ec445 Release v1.4.2 into Longhorn upgrade responder
longhorn-5864

Signed-off-by: Phan Le <phan.le@suse.com>
2023-05-13 06:34:15 +08:00
David Ko
51d0c51ee2
Update README.md 2023-05-12 16:50:38 +08:00
David Ko
3904838518
Update README.md 2023-05-12 14:48:04 +08:00
Jack Lin
2cd50c6ff8 feat: add softAntiAffinity and zoneSoftAntiAffiny to volume spec
ref: longhorn/longhorn 5358

Signed-off-by: Jack Lin <jack.lin@suse.com>
2023-05-12 11:35:00 +08:00
Chin-Ya Huang
13bf7b6af0 feat(upgrade-responder): setup dev environment
ref: 5235

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-11 18:12:34 +08:00
Chris
33c53e101a Update extend-csi-snapshot-to-support-backingimage.md
ref longhorn 5005

Signed-off-by: Chris <chris.chien@suse.com>
2023-05-11 18:03:25 +08:00
Derek Su
cd461a9333 prerequisite: check SSE4.2 on x86-64 platform for SPDK feature
Longhorn 5738

Signed-off-by: Derek Su <derek.su@suse.com>
2023-05-10 11:17:32 +08:00
Derek Su
dab06c96e4 prerequisite: check uio_pci_generic instead of uio
Longhorn 5738

Signed-off-by: Derek Su <derek.su@suse.com>
2023-05-10 01:27:42 +08:00
Derek Su
9c7cfd7a53 prerequisite: fix old bash complaining about "declare -A"
Longhorn 5738

Signed-off-by: Derek Su <derek.su@suse.com>
2023-05-10 01:27:42 +08:00
Derek Su
b15eac47a6 prerequisite: check the nvme-cli version
Old "nvme connect" does not support output format "-o" option.

Longhorn 5738

Signed-off-by: Derek Su <derek.su@suse.com>
2023-05-10 01:27:42 +08:00
Phan Le
1b3398e54e Release v1.3.3 in Longhorn upgrade responder
longhorn-5581

Signed-off-by: Phan Le <phan.le@suse.com>
2023-05-09 14:33:13 +08:00
Chin-Ya Huang
433a5fa6c7 feat(lep): upgrade checker info collection
Ref: 5235

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-09 14:30:42 +08:00
Derek Su
9a0883e8f2 prerequisite: add spdk environment setup manifest
Longhorn 5738

Signed-off-by: Derek Su <derek.su@suse.com>
2023-05-09 00:36:57 +08:00
Derek Su
73a8bda8bd
Environment check for SPDK support (#5880)
* Environment check for SPDK support

Longhorn 5738
Longhorn 5380

Signed-off-by: Derek Su <derek.su@suse.com>
Co-authored-by: David Ko <dko@suse.com>
2023-05-08 17:21:47 +08:00
James Lu
d1c3f58399 fix: remove privileged from lifecycle jobs
Remove `privileged` requirement from lifecycle jobs in
`uninstall/uninstall.yaml`.

Ref: 5862

Signed-off-by: James Lu <james.lu@suse.com>
2023-05-08 16:21:00 +08:00
James Lu
094b61b66c fix: remove privileged from lifecycle jobs
Remove `privileged` requirement from lifecycle jobs `post-upgrade`
and `uninstall`.

Ref: 5862

Signed-off-by: James Lu <james.lu@suse.com>
2023-05-08 15:47:08 +08:00
James Lu
e38d6aed78 feat: introduce helm hook pre-upgrade
Add the helm `pre-upgrade` hook file `preupgrade-job.yaml` to protect the system from unsupported upgrade paths.

Ref: 5131

Signed-off-by: James Lu <james.lu@suse.com>
2023-05-08 12:47:56 +08:00
Derek Su
dec0d4c11d prerequisite: add nvme-cli installation manifest
Longhorn 5738

Signed-off-by: Derek Su <derek.su@suse.com>
2023-05-08 10:00:29 +08:00
Derek Su
3f16363ff1 Update crd template and longhorn.yaml
Longhorn 5143

Signed-off-by: Derek Su <derek.su@suse.com>
2023-05-04 17:32:15 +08:00
Eric Weber
e38f29772d Remove deprecated mkfs-ext4-parameters setting
Signed-off-by: Eric Weber <eric.weber@suse.com>
2023-05-03 22:53:08 +08:00
Chin-Ya Huang
81adad7ae4 feat(consolidate-im): update YAML
Ref: 5208

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-03 11:02:52 +08:00
Chin-Ya Huang
498fa5afe7 feat(consolidate-im): update chart
Ref: 5208

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-03 11:02:52 +08:00
Chin-Ya Huang
5f4111249a feat(lep): consolidate instance managers
Ref: 5208

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-03 09:57:48 +08:00
Chin-Ya Huang
26a6c23156 feat(recurring-job): update YAML
Ref: 5186

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-03 08:20:46 +08:00
Chin-Ya Huang
ec91e90f08 feat(recurring-job): update chart
Ref: 5186

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-05-03 08:20:46 +08:00
Jack Lin
28ed96a319 feat(lep): extend csi snapshot to support backingimage
ref: longhorn/longhorn 5005

Signed-off-by: Jack Lin <jack.lin@suse.com>
2023-04-27 16:18:25 +08:00
Jack Lin
88101a2274 refactor (webhook and recovery service): merge webhook and recovery service into longhorn manager daemonset
Ref: longhorn/longhorn 5590

Signed-off-by: Jack Lin <jack.lin@suse.com>
2023-04-27 16:17:46 +08:00
Shuo Wu
ab67f9c98c example: Update network-policy
Signed-off-by: Shuo Wu <shuo.wu@suse.com>
2023-04-26 20:08:56 +08:00
James Lu
7cc3351ff8 LEP for Azure Blob Storage Backup Store Support
Ref: 1309

Signed-off-by: James Lu <james.lu@suse.com>
2023-04-26 15:31:58 +08:00
Mohit Bisht
e454db847d Update README.md 2023-04-25 16:28:13 +08:00
Chin-Ya Huang
e3e006cbcc chore(support-bundle): version bump
Ref: 5614

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-04-19 13:05:21 +08:00
Phan Le
4af6f26acc Release v1.4.1 in the upgrade responder server
longhorn-5445

Signed-off-by: Phan Le <phan.le@suse.com>
2023-04-15 09:47:39 +08:00
Chin-Ya Huang
e1cc7af587 chore(support-bundle): version bump
Ref: 5614

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-04-14 18:22:25 +08:00
James Lu
58ed0277e3 feat(lep): upgrade path enforcement
Ref: 5131

Signed-off-by: James Lu <james.lu@suse.com>
2023-04-14 16:18:12 +08:00
Eric Weber
ccf9f3a32d Bump CSI sidecars for K8s v1.20+ and Longhorn v1.5
Signed-off-by: Eric Weber <eric.weber@suse.com>
2023-04-14 00:56:26 +08:00
Eric Weber
d5f5cec2f9 Bump CSI sidecars for K8s v1.17+
Signed-off-by: Eric Weber <eric.weber@suse.com>
2023-04-14 00:56:26 +08:00
Tarasovych
3f5e636bc3 Update values.yaml 2023-04-07 12:32:14 +08:00
Chin-Ya Huang
54e6163356 chore(support-bundle): version bump
Ref: 5614

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-04-07 11:32:12 +08:00
JenTing Hsiao
6764850dca chore: add CODEOWNERS file
Signed-off-by: JenTing Hsiao <hsiaoairplane@gmail.com>
2023-03-22 23:32:46 +08:00
James Lu
15701bbe26 feat(recurring-job): update chart for new tasks
Ref: 4898

Signed-off-by: James Lu <james.lu@suse.com>
2023-03-20 16:17:26 +08:00
James Lu
6c6cb23be1 feat(recurring-job): update YAML for new tasks
Ref: 4898

Signed-off-by: James Lu <james.lu@suse.com>
2023-03-20 16:17:26 +08:00
Ray Chang
9abb26714b fix(support-bundle): version bump to v0.0.20
Longhorn 5073

- New parameter `SUPPORT_BUNDLE_COLLECTOR` to execute the specified support-bundle-kit collector

Signed-off-by: Ray Chang <ray.chang@suse.com>
2023-03-20 09:58:53 +08:00
Phan Le
86d06696df Add nodeDrainPolicy setting
longhorn-5549

Signed-off-by: Phan Le <phan.le@suse.com>
2023-03-18 08:25:13 +08:00
Phan Le
702c2e65d3 Modify the LEP: Use PDB to protect Longhorn components from drains
longhorn-3304

Signed-off-by: Phan Le <phan.le@suse.com>
2023-03-16 10:05:04 +08:00
Chin-Ya Huang
f82928c33e feat(lep): recurring filesystem trim design
Ref: 5186

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-03-15 17:34:48 +08:00
David Ko
ec6480dd4c docs: add 1.4.1 to changelog
Signed-off-by: David Ko <dko@suse.com>
2023-03-14 13:24:13 +08:00
David Ko
a22e7cd960 docs: make 1.4.1 stable
Signed-off-by: David Ko <dko@suse.com>
2023-03-13 19:00:54 +08:00
Phan Le
3f1666ec24 Add LEP: Use PDB to protect Longhorn components from drains
longhorn-3304

Signed-off-by: Phan Le <phan.le@suse.com>
2023-03-13 10:43:18 +08:00
ChanYiLin
55babc8300 doc: update prerequisites in chart readme to make it consistent with documentation
Signed-off-by: Jack Lin <jack.lin@suse.com>
2023-03-13 10:30:52 +08:00
Chin-Ya Huang
cb6307b799 docs: fix typo
Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-03-09 08:47:11 +08:00
Viktor Hedefalk
92fd5b54ed Update data_migration.yaml
Fixes #5484
2023-03-08 15:55:58 +08:00
Chin-Ya Huang
5a3f8d714b fix(support-bundle): version bump
- fix support-bundle agent missing registry secret

Ref: 5467

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-03-03 08:37:03 +08:00
Rayan Das
e1ea3d7515 update k8s.gcr.io to registry.k8s.io
Signed-off-by: Rayan Das <rayandas91@gmail.com>
2023-02-22 13:29:23 +08:00
Chin-Ya Huang
2ea5513286 feat(recurring-job): update YAML for new tasks
Ref: 3836

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-02-21 11:59:07 +08:00
Chin-Ya Huang
761abc7611 feat(recurring-job): update chart for new tasks
Ref: 3836

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-02-21 11:59:07 +08:00
Chin-Ya Huang
4b17f8fbcd fix(crd): update YAML
Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-02-17 15:12:24 +08:00
Chin-Ya Huang
8c5dd01964 fix(crd): update chart
Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-02-17 15:12:24 +08:00
Derek Su
bb1bd7d4db Add cifs-utils installation manifest
Longhorn 3599

Signed-off-by: Derek Su <derek.su@suse.com>
2023-02-13 22:05:49 +08:00
Derek Su
b1ed0589b2 Add LEP for SMB/CIFS Backup Store Support
Longhorn 3599

Signed-off-by: Derek Su <derek.su@suse.com>
2023-02-09 13:19:15 +08:00
Phan Le
1deb51287b Update PSP validation
Longhorn-5339

Signed-off-by: Phan Le <phan.le@suse.com>
2023-02-07 15:57:09 +08:00
achims311
94a23e5b05
Fix for bug #5304 (second version including POSIX way to call subroutine) (#5314)
* Fix for bug #5304.

It uses the same technique to get the kernel release as was previously used to get the OS of the node.

Signed-off-by: Achim Schaefer <longhorn@schaefer-home.eu>

* used a lower case variable name as suggested by innobead

Signed-off-by: Achim Schaefer <longhorn@schaefer-home.eu>

---------

Signed-off-by: Achim Schaefer <longhorn@schaefer-home.eu>
Co-authored-by: David Ko <dko@suse.com>
2023-02-07 14:58:24 +08:00
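Purely as an illustration of the data this check inspects (not environment_check.sh's actual implementation, which reuses its existing per-node mechanism as the commit notes), the kernel release reported by every node is also visible from the node status:

```sh
# Illustration only: list each node's kernel release as reported in its status.
kubectl get nodes -o custom-columns='NODE:.metadata.name,KERNEL:.status.nodeInfo.kernelVersion'
```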
Derek Su
a7119b5bda backup: update crds
Longhorn 5189

Signed-off-by: Derek Su <derek.su@suse.com>
2023-02-06 18:14:30 +08:00
Thomas Fenzl
674cdd0df0 update iscsi installation image to latest alpine. 2023-02-05 23:15:18 +08:00
David Ko
d8a5c4ffd5 fix: wrong indentation of priorityClassName in deployment-webhook.yaml
Signed-off-by: David Ko <dko@suse.com>
2023-02-05 23:02:07 +08:00
Phan Le
3a30bd8fed Remove the share parameter since the accessMode should be determined by the PVC

Signed-off-by: Phan Le <phan.le@suse.com>
2023-02-02 17:29:02 -08:00
Haribo112
5a071e502c
Made environment_check.sh POSIX compliant (#5310)
Made environment_check.sh POSIX compliant

Signed-off-by: Harold Holsappel <h.holsappel@iwink.nl>
Co-authored-by: Harold Holsappel <h.holsappel@iwink.nl>
2023-01-23 14:40:17 -08:00
Derek Su
4fa27a3ca9 Add LEP for Improve Backup and Restore Efficiency using Multiple Threads and Faster Compression Methods
Longhorn 5189

Signed-off-by: Derek Su <derek.su@suse.com>
2023-01-13 11:01:48 +08:00
Ray Chang
ccf3740b5b fix: update the supportBundleKit image description
Signed-off-by: Ray Chang <ray.chang@suse.com>
2023-01-12 12:24:05 +08:00
Ray Chang
4250b68b0f fix: add Support Bundle Kit image related variables in questions.yaml
Signed-off-by: Ray Chang <ray.chang@suse.com>
2023-01-12 10:58:36 +08:00
Chin-Ya Huang
9c1c474dc2 feat(lep): recurring snapshot delete design
Ref: 3836

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2023-01-11 18:31:15 +08:00
Phan Le
69dcfa5277 Update uninstallation info to include the 'Deleting Confirmation Flag' in chart
longhorn-5250

Signed-off-by: Phan Le <phan.le@suse.com>
2023-01-11 14:56:45 +08:00
Ray Chang
a7e4b23350 fix: fix the CSI Liveness Probe group in questions.yaml
Signed-off-by: Ray Chang <ray.chang@suse.com>
2023-01-11 11:14:09 +08:00
Phan Le
b8ec64414c Add Longhorn v1.4.0 in upgrade responder server
Signed-off-by: Phan Le <phan.le@suse.com>
2023-01-10 00:31:13 +08:00
David Ko
86a9be5c33
Update README.md 2023-01-08 00:31:14 +08:00
Ray Chang
145b166720 fix: Correct formatting error in question.yaml file
Signed-off-by: Ray Chang <ray.chang@suse.com>
2023-01-05 17:56:19 +08:00
James Lu
b06ce86784 fix: refine the indentation
The indentation of chart/questions.yaml in
`variable: defaultSettings.restoreVolumeRecurringJobs` is not
correct.

ref: 5196

Signed-off-by: James Lu <james.lu@suse.com>
2023-01-05 13:29:37 +08:00
Sjouke de Vries
68d6e221a1 feat(helm): support affinity for longhornManager
Signed-off-by: Sjouke de Vries <info@sdvservices.nl>
2023-01-03 17:59:07 +08:00
David Ko
76eaa3d3c1 release: update 1.4.0 changelog
Signed-off-by: David Ko <dko@suse.com>
2022-12-30 12:57:22 +08:00
David Ko
15f55be936 release: update 1.4.0 changelog
Signed-off-by: David Ko <dko@suse.com>
2022-12-29 17:35:28 +08:00
David Ko
715dd93150 release: rename changelog to uppercase
Signed-off-by: David Ko <dko@suse.com>
2022-12-29 12:34:22 +08:00
David Ko
ab92fece63 release: add 1.4.0 changelog
Signed-off-by: David Ko <dko@suse.com>
2022-12-29 12:25:41 +08:00
Derek Su
62998adab2 environment check: precisely check kernel option
Longhorn 3157

Signed-off-by: Derek Su <derek.su@suse.com>
2022-12-26 20:22:58 +08:00
Derek Su
c83497b685 environment_check.sh: add nfs client kernel support
Longhorn 3157

Signed-off-by: Derek Su <derek.su@suse.com>
2022-12-26 16:09:24 +08:00
Chin-Ya Huang
38aa0d01d5 fix(uninstall): missing resource in ClusterRole
Ref: 5132, 5133

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2022-12-23 13:54:02 +08:00
David Ko
c9488eb1f9 ci: add mergify for auto merge
Signed-off-by: David Ko <dko@suse.com>
2022-12-21 17:06:52 +08:00
Derek Su
3a36dab7ca chart: add replicaFileSyncHttpClientTimeout
Longhorn 5110

Signed-off-by: Derek Su <derek.su@suse.com>
2022-12-21 14:12:30 +08:00
James Lu
c4bf0b3a47 build(image): bump support-bundle-kit
bump support-bundle-kit version to v0.0.17

Ref: 5107

Signed-off-by: James Lu <james.lu@suse.com>
2022-12-20 16:31:08 +08:00
David Ko
6b53539738
Update README.md 2022-12-17 23:06:25 +08:00
David Ko
b08acf2457
Update README.md 2022-12-17 21:25:00 +08:00
David Ko
7ef16f1240
Update README.md 2022-12-17 21:18:52 +08:00
David Ko
5f9bb1aaa6
Update README.md 2022-12-17 21:13:41 +08:00
David Ko
3ca724d928
Update README.md 2022-12-17 21:11:02 +08:00
David Ko
fa0e458b3e
Update README.md 2022-12-17 21:08:35 +08:00
David Ko
ca36721b81
Update README.md 2022-12-17 21:07:25 +08:00
David Ko
604acd1870
Update README.md 2022-12-17 20:58:17 +08:00
Derek Su
867257d59a chart: support customized number of replicas of webhook and recovery-backend
Longhorn 5087

Signed-off-by: Derek Su <derek.su@suse.com>
2022-12-16 20:31:12 +08:00
James Lu
06c4189bf9 chore(ui): modify Affinity of UI for helm chart
Change the number of replicas from 1 to 2 in the helm chart.

Ref: 4987

Signed-off-by: James Lu <james.lu@suse.com>
2022-12-15 16:18:56 +08:00
James Lu
5fa7579794 chore(ui): modify Affinity of UI in deploy.yaml
Change the number of replicas from 1 to 2.

Ref: 4987

Signed-off-by: James Lu <james.lu@suse.com>
2022-12-15 16:18:56 +08:00
Shuo Wu
aa3998ee3a example: Update comments of encrypted storageclasses for online expansion
Longhorn 1674

Signed-off-by: Shuo Wu <shuo.wu@suse.com>
2022-12-12 21:56:36 +08:00
David Ko
4f35fda4b2 docs: fix typo
Signed-off-by: David Ko <dko@suse.com>
2022-12-12 14:20:19 +08:00
David Ko
c6506097fd chore: add code spell
Signed-off-by: David Ko <dko@suse.com>
2022-12-12 14:20:19 +08:00
David Ko
f30875aa58 Update system managed components' images to master-head
Signed-off-by: David Ko <dko@suse.com>
2022-12-09 10:55:23 +08:00
Chin-Ya Huang
5f50e6f244 feat(concurrent-backup-restore): update chart
Ref: 4558

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2022-12-09 09:46:31 +08:00
Chin-Ya Huang
5846009648 feat(system-backup/restore): update YAML
Ref: 1455

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2022-12-08 22:38:50 +08:00
Chin-Ya Huang
0706d83133 feat(system-backup/restore): update chart
Ref: 1455

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2022-12-08 22:38:50 +08:00
Chin-Ya Huang
d57db31395 feat(lep): concurrent backup restore design
Ref: 4558

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2022-12-08 22:37:40 +08:00
Derek Su
986c4b96b0 Add LEP for Dedicated Recovery Backend for RWX Volume's NFS Server
Longhorn 2293

Signed-off-by: Derek Su <derek.su@suse.com>
2022-12-08 22:36:15 +08:00
Derek Su
939ac11774 environment_check.sh: check the hostname uniqueness
Longhorn 5012

Signed-off-by: Derek Su <derek.su@suse.com>
2022-12-08 21:41:46 +08:00
Derek Su
13ac2a6641 chart: add conditions in engine/replica status.instanceStatus
Longhorn 1330

Signed-off-by: Derek Su <derek.su@suse.com>
2022-12-06 12:45:56 +08:00
David Ko
207e74ecd4 Update backport github action due to label change
Signed-off-by: David Ko <dko@suse.com>
2022-12-04 12:26:42 +08:00
Derek Su
0e56cd1e9a chart: add fastReplicaRebuildEnabled setting
Longhorn 4783

Signed-off-by: Derek Su <derek.su@suse.com>
2022-12-02 01:24:53 +08:00
Chin-Ya Huang
f32cd21452 chore: support-bundle-kit version bump
Included fix for multi-platform

Ref: 2759

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2022-12-01 15:11:32 +08:00
Derek Su
0e002df132 chart: update longhorn-instance-manager to v2_20221130
Longhorn 4783

Signed-off-by: Derek Su <derek.su@suse.com>
2022-11-30 22:51:27 +08:00
Phan Le
066dde1110 Add PSP yaml
longhorn-4003

Signed-off-by: Phan Le <phan.le@suse.com>
2022-11-29 12:59:35 +08:00
Shuo Wu
6bf1747822 chart: Add the new setting for the filesystem trim feature
Longhorn 836

Signed-off-by: Shuo Wu <shuo.wu@suse.com>
2022-11-29 12:55:20 +08:00
Shuo Wu
97cc2abc7c chart: Update CRD YAML for the filesystem trim feature
Longhorn 836

Signed-off-by: Shuo Wu <shuo.wu@suse.com>
2022-11-29 12:55:20 +08:00
Shuo Wu
d4d4a05695 example: Update storageclass.yaml for the filesystem trim feature
Longhorn 836

Signed-off-by: Shuo Wu <shuo.wu@suse.com>
2022-11-29 12:55:20 +08:00
Shuo Wu
9da5dda258 deploy: Update CRD YAML for the filesystem trim feature
Longhorn 836

Signed-off-by: Shuo Wu <shuo.wu@suse.com>
2022-11-29 12:55:20 +08:00
Chin-Ya Huang
400b8cd097 feat(csi-liveness-probe): update air-gap images
Ref: 3907

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2022-11-28 22:54:48 +08:00
Chin-Ya Huang
f3525fe363 feat(csi-liveness-probe): update YAML
Ref: 3907

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2022-11-28 22:54:48 +08:00
Chin-Ya Huang
e847b7f62c feat(csi-liveness-probe): chart update
Ref: 3907

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2022-11-28 22:54:48 +08:00
James Lu
086676cdc5 feat(backup/restore): update volume crd
Longhorn-2227

Signed-off-by: James Lu <james.lu@suse.com>
2022-11-28 22:49:21 +08:00
James Lu
321671b879 feat(backup/restore): update chart
Longhorn-2227

Signed-off-by: James Lu <james.lu@suse.com>
2022-11-28 22:49:21 +08:00
Chin-Ya Huang
91350b05fb feat(support-bundle): image version bump
Ref: 2759

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2022-11-28 16:37:12 +08:00
Derek Su
91c0faf004 lep: add local-volume
Longhorn 3957

Signed-off-by: Derek Su <derek.su@suse.com>
2022-11-25 16:44:43 +08:00
Derek Su
2e0fc456be local volume: update crds and manifests
Longhorn 3957

Signed-off-by: Derek Su <derek.su@suse.com>
2022-11-25 14:42:31 +08:00
Chin-Ya Huang
1101ebed73 feat(support-bundle): update air-gap image
Ref: 2759

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2022-11-24 23:00:23 +08:00
Chin-Ya Huang
71a59a08c7 feat(support-bundle): update YAML
Ref: 2759

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2022-11-24 23:00:23 +08:00
Chin-Ya Huang
3d28249c19 feat(support-bundle): update chart
Ref: 2759

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2022-11-24 23:00:23 +08:00
Derek Su
2d4000410d chart: update longhorn-instance-manager to v2_20221123
longhorn/longhorn#4210
longhorn/longhorn#3198

Signed-off-by: Derek Su <derek.su@suse.com>
2022-11-23 17:43:55 +08:00
Derek Su
9ed6dca696 local volume: update crds and manifests
Longhorn 3957

Signed-off-by: Derek Su <derek.su@suse.com>
2022-11-23 17:43:55 +08:00
Ray Chang
a0a6411449 feat: modify mkfsParams as common values in storage class example
Longhorn 4642

Signed-off-by: Ray Chang <ray.chang@suse.com>
2022-11-22 22:41:26 +08:00
Ray Chang
c546ba83da feat: Added mkfsParams to storage class example
Longhorn 4642

Signed-off-by: Ray Chang <ray.chang@suse.com>
2022-11-22 12:43:18 +08:00
David Ko
576a4288c8 Sync missing examples from longhorn-manager
Signed-off-by: David Ko <dko@suse.com>
2022-11-22 08:08:53 +08:00
Derek Su
75222752d2 snapshot check: add settings
Longhorn 4210
Longhorn 3198

Signed-off-by: Derek Su <derek.su@suse.com>
2022-11-18 11:50:09 +08:00
Derek Su
2769cecfef Update longhorn-instance-manager to v3_20221117
Longhorn 4210
Longhorn 3198

Signed-off-by: Derek Su <derek.su@suse.com>
2022-11-18 11:50:09 +08:00
Derek Su
a63cc05a7f snapshot verification: update crds
Longhorn 4210
Longhorn 3198

Signed-off-by: Derek Su <derek.su@suse.com>
2022-11-18 11:50:09 +08:00
Chin-Ya Huang
dab0603847 feat(support-bundle): LEP
Ref: 2759

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2022-11-17 00:03:02 +08:00
Derek Su
7b27a9ad49 Update longhorn-instance-manager to v2_20221115
Longhorn 4491

Signed-off-by: Derek Su <derek.su@suse.com>
2022-11-15 21:29:15 +08:00
Derek Su
7ad77e1a6b chart: add engineReplicaTimeout
Longhorn 4491

Signed-off-by: Derek Su <derek.su@suse.com>
2022-11-15 18:12:48 +08:00
Shuo Wu
0898e6fa65 enhancement: Add a new enhancement 'filesystem-trim'
Longhorn 836

Signed-off-by: Shuo Wu <shuo.wu@suse.com>
2022-11-07 21:51:38 +08:00
Derek Su
503df993c0 lep: add Snapshot Checksum Calculation and Bit Rot Detection
Longhorn 3198
Longhorn 4210

Signed-off-by: Derek Su <derek.su@suse.com>
2022-11-07 21:49:30 +08:00
Chin-Ya Huang
f3c81c1662 feat(system-backup/restore): add LEP
Longhorn-1455

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2022-11-07 18:37:13 +08:00
David Ko
f57e05cbed Update issue templates
Signed-off-by: David Ko <dko@suse.com>
2022-11-07 14:04:26 +08:00
David Ko
23d264196f
Update README.md 2022-11-04 19:54:37 +08:00
Phan Le
209aeaf5be Add Longhorn v1.3.2 to upgrade response config
Signed-off-by: Phan Le <phan.le@suse.com>
2022-11-04 14:29:27 +08:00
Phan Le
b2410e2dab Add Longhorn v1.2.6 to upgrade response config
Signed-off-by: Phan Le <phan.le@suse.com>
2022-11-04 14:14:31 +08:00
Yang Chiu
22daa08f60 ci: refine community project action and add qa project action
Signed-off-by: Yang Chiu <yang.chiu@suse.com>
2022-11-03 14:28:00 +08:00
Phan Le
cc043c43d1 Keep the failed uninstallation job's pod to get logs for debugging
longhorn-4711

Signed-off-by: Phan Le <phan.le@suse.com>
2022-11-02 13:34:56 +08:00
Yang Chiu
27f6470f5c ci: add community issue action
Signed-off-by: Yang Chiu <yang.chiu@suse.com>
2022-11-02 12:25:37 +08:00
James Lu
6de8c36fba feat(encrypt): Allow customization of the cipher
Extend the old LEP `pv-encryption.md` to allow users to customize the cipher options used by `cryptsetup`.

Longhorn 3353

Signed-off-by: James Lu <james.lu@suse.com>
2022-11-01 16:47:22 +08:00
Ray Chang
c3095ee6e0 fix: modify parameter names, output-file and version, in help function
Longhorn 1521

Signed-off-by: Ray Chang <ray.chang@suse.com>
2022-11-01 16:09:24 +08:00
Phan Le
777bc5b0c0 Add LEP for Longhorn snapshot CRD feature
Longhorn-3144

Signed-off-by: Phan Le <phan.le@suse.com>
2022-10-21 16:22:17 +08:00
Ray Chang
704f6518ec improvement(restore): script for restoring from backup file
Longhorn 1521

Signed-off-by: Ray Chang <ray.chang@suse.com>
2022-10-20 23:27:01 +08:00
James Lu
53c9c407ed feat(backup): add recording recurring jobs LEP
The LEP describes the design of recording and restoring recurring jobs into/from the backup volume on the backup target.

Longhorn 2227

Signed-off-by: James Lu <james.lu@suse.com>
2022-10-18 20:21:10 +08:00
David Ko
3ef709c84e
Update README.md 2022-10-14 16:44:14 +08:00
Phan Le
89270bf0fa Sync uninstallation manifest from longhorn/longhorn-manager repo
Longhorn-4239

Signed-off-by: Phan Le <phan.le@suse.com>
2022-10-14 01:48:53 +08:00
Phan Le
6172382d1b Kubernetes 1.25 support
1. Set enablePSP to false by default
1. Bump K8s min version support and CSI sidecar versions
    * Min Kubernetes version from 1.18 to 1.21
    * longhornio/csi-resizer v1.2.0 -> v1.3.0
    * longhornio/csi-snapshotter v3.0.3 -> v5.0.1
1. Update CSI snapshot examples from v1beta1 to v1
1. Update images file
1. Generate new longhorn.yaml from the chart for kubectl

Longhorn-4003
Longhorn-4239

Signed-off-by: Phan Le <phan.le@suse.com>
2022-10-14 01:48:53 +08:00
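The entry above lists the chart-side changes for Kubernetes 1.25. A hedged example of how the `enablePSP` value named there is set during a Helm install or upgrade, assuming the chart repo has been added under the usual `longhorn` alias:

```sh
# Illustrative only: on Kubernetes >= 1.25, PodSecurityPolicy is removed, so the
# chart's enablePSP value (now false by default) must stay disabled.
helm upgrade --install longhorn longhorn/longhorn \
  --namespace longhorn-system --create-namespace \
  --set enablePSP=false
```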
Phan Le
fca7f3a9a0 Fix bug: cannot re-run helm uninstallation if the first one failed and cannot fetch logs of failed uninstallation job
Longhorn-4711

Signed-off-by: Phan Le <phan.le@suse.com>
2022-10-14 01:46:53 +08:00
Phan Le
994ef67d21 Add deleting-confirmation-flag setting
Longhorn-4509

Signed-off-by: Phan Le <phan.le@suse.com>
2022-10-13 22:33:25 +08:00
Derek Su
97cb124f24 example: add network policy for recovery backend
Longhorn 4675

Signed-off-by: Derek Su <derek.su@suse.com>
2022-10-04 18:19:45 +08:00
Phan Le
d3660e5474 Fix: update outdated csi-image tags in chart/questions.yaml
Signed-off-by: Phan Le <phan.le@suse.com>
2022-10-04 08:02:50 +08:00
msougtani
502dcb72ed refer to issue 4574
Signed-off-by: msougtani <moujib.sougtani@arhs-consulting.com>

correct typo

Signed-off-by: msougtani <moujib.sougtani@arhs-consulting.com>
2022-09-30 15:00:10 +08:00
Ray Chang
41b92af023 Remove resources block in values.yaml
Longhorn 4601

Signed-off-by: Ray Chang <ray.chang@suse.com>
2022-09-29 14:00:06 +08:00
Ray Chang
641b6cb856 Add two arguments for recurring job history
Longhorn 1711

Signed-off-by: Ray Chang <ray.chang@suse.com>
2022-09-29 13:58:54 +08:00
Yang Chiu
ff9b19bbf5 ci: refine automation issue action
Signed-off-by: Yang Chiu <yang.chiu@suse.com>
2022-09-20 15:21:22 +08:00
Derek Su
9d77c781bc deploy: update longhorn-instance-manager to v1_20220920
Longhorn 3601

Signed-off-by: Derek Su <derek.su@suse.com>
2022-09-20 11:19:30 +08:00
Yang Chiu
1e55e457c3 ci: add kind/backport label for backport issues
Signed-off-by: Yang Chiu <yang.chiu@suse.com>
2022-09-20 09:50:11 +08:00
Ray Chang
bfe44afdf8 Update longhorn-share-manager to v1_20220914
Longhorn 2380

Signed-off-by: Ray Chang <ray.chang@suse.com>
2022-09-14 19:18:05 +08:00
David Ko
8c876810bc Revert "Add require/automation-e2e to enhance/improvement ticket"
This reverts commit 6f885bf313.

Signed-off-by: David Ko <dko@suse.com>
2022-09-12 22:08:38 +08:00
Phan Le
61a03cb24b Add upgrade responder server helm values
This is the first step for the pipeline to automatically update the upgrade responder server

Longhorn-4423

Signed-off-by: Phan Le <phan.le@suse.com>
2022-09-12 16:29:22 +08:00
Yang Chiu
3e972418a9 ci: add github action for backport and automation issues
Signed-off-by: Yang Chiu <yang.chiu@suse.com>
2022-09-12 15:35:44 +08:00
Derek Su
3375a5b613 Update longhorn-share-manager to v1_20220825
Longhorn 2293

Signed-off-by: Derek Su <derek.su@suse.com>
2022-09-06 00:32:31 +08:00
Derek Su
b92a30910c chart: add and update manifests for recovery-backend
Longhorn 2293

Signed-off-by: Derek Su <derek.su@suse.com>
2022-09-06 00:32:31 +08:00
David Ko
e71c029cf1
Update README.md 2022-09-03 17:15:16 +08:00
David Ko
5c63250893
Update README.md 2022-09-02 14:36:02 +08:00
Phan Le
cc7a937188 Fix the document about the flags
There shouldn't be an `=` character

Longhorn-3714

Signed-off-by: Phan Le <phan.le@suse.com>
2022-09-02 13:15:54 +08:00
Chin-Ya Huang
cccabbf89f fix: enhancement template typos
Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2022-08-25 18:02:06 +08:00
David Ko
6f885bf313 Add require/automation-e2e to enhance/improvement ticket
Signed-off-by: David Ko <dko@suse.com>
2022-08-25 16:56:04 +08:00
James Lu
b243e93f94 Update LEP: Failed-Backups-Cleanup
Add failed backup TTL to enable and disable the auto-deletion.

Signed-off-by: James Lu <james.lu@suse.com>
2022-08-24 09:23:43 +08:00
James Lu
8759703a8b Failed backups cleanup: update deploy YAML
Add a new option `failed-backup-ttl` and update the LEP for failed
backup cleanup.

Longhorn 3898

Signed-off-by: James Lu <james.lu@suse.com>
2022-08-24 09:23:43 +08:00
David Ko
20ba35f9a2
Update README.md 2022-08-12 10:24:53 +08:00
Phan Le
9ccdcccf17 Add volumeattachment to uninstaller's service account
longhorn-4405

Signed-off-by: Phan Le <phan.le@suse.com>
2022-08-12 06:59:21 +08:00
David Ko
743fa08e8f Update share and backing images
Signed-off-by: David Ko <dko@suse.com>
2022-08-08 17:10:02 +08:00
Shuo Wu
96cceeb539 Update longhorn-instance-manager tag
Signed-off-by: Shuo Wu <shuo.wu@suse.com>
2022-08-08 17:04:26 +08:00
David Ko
51b0fd8453 Update share and backing images
Signed-off-by: David Ko <dko@suse.com>
2022-08-08 13:58:15 +08:00
Derek Su
95ef30ba72 Update charts
Longhorn 4332

Signed-off-by: Derek Su <derek.su@suse.com>
2022-08-02 15:41:33 +08:00
David Ko
2d7e6a1283 Add community issues to Community project
Signed-off-by: David Ko <dko@suse.com>
2022-08-02 15:23:49 +08:00
David Ko
b7dcd5b348 Add test or infra issues to QA-DevOps project
Signed-off-by: David Ko <dko@suse.com>
2022-08-02 14:07:00 +08:00
James Lu
c0dd5f5713 Failed backups cleanup LEP
The LEP describes the design for deleting failed backups via `backup_volume_controller`.

Longhorn 3898

Signed-off-by: James Lu <james.lu@suse.com>
2022-08-02 12:31:24 +08:00
Serge Tkatchouk
c1b93f5531 Add Gentoo support to environment_check.sh
This addition will allow Gentoo users to run this script and get sensible error messages in case they forgot to install required packages.

Signed-off-by: Serge Tkatchouk <sp1j3t@gmail.com>
2022-07-29 12:56:10 +08:00
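A hypothetical sketch of the kind of per-distro package check the commit above extends to Gentoo; the command and package names here are illustrative, not the script's actual logic:

```sh
# Hypothetical sketch, not environment_check.sh's actual code: report any missing
# client tools together with a distro-appropriate install hint.
for cmd in iscsiadm mount.nfs; do
  if ! command -v "$cmd" >/dev/null 2>&1; then
    echo "ERROR: '$cmd' not found; install the package that provides it" \
         "(e.g. emerge sys-block/open-iscsi or net-fs/nfs-utils on Gentoo)"
  fi
done
```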
Derek Su
df3462e205 chart: fix the conditions of Rancher deployed Windows Cluster
Longhorn 4289

Signed-off-by: Derek Su <derek.su@suse.com>
2022-07-27 12:19:50 +08:00
Tristan Lins
d3e4d6e198 Add preserveUnknownFields: false to crd specs
Signed-off-by: Tristan Lins <tristan.lins@chamaeleon.de>
2022-07-26 19:14:25 +08:00
c3y1huang
81a0941b1d
chart: support Rancher deployed Windows Cluster (#4262) 2022-07-22 18:44:56 +08:00
Derek Su
180f0a5041 Update longhorn-instance-manager to v1_20220722
Longhorn 4136

Signed-off-by: Derek Su <derek.su@suse.com>
2022-07-22 16:58:24 +08:00
David Ko
91fa3bb642 Update manifests for 1.4.0 dev purpose and use master-head images
Signed-off-by: David Ko <dko@suse.com>
2022-07-22 16:41:20 +08:00
David Ko
0a275ab34f Add update-manifests-dev-version.sh
Signed-off-by: David Ko <dko@suse.com>
2022-07-22 16:41:20 +08:00
lodufqa
eda558c0d5 Update chart/values.yaml
Co-authored-by: David Ko <dko@suse.com>
Signed-off-by: Amadeusz Kryze <amadeusz.kryze@gmail.com>
2022-07-21 16:10:52 +08:00
Amadeusz Kryze
1e7289dfe0 Annotation for service LonghornManager will be configurable.
Signed-off-by: Amadeusz Kryze <amadeusz.kryze@gmail.com>
2022-07-21 16:10:52 +08:00
Sebastian Podjasek
d48e95b8c3 Add value to set manager log in json format
Signed-off-by: Sebastian Podjasek <sebastian.podjasek@intelliway.pl>
2022-07-21 08:14:36 +08:00
Dante Foulke
edc1b83c5f bugfix for issue #4216
Signed-off-by: Dante Foulke <flkdnt@gmail.com>
2022-07-13 16:28:16 +08:00
Phan Le
0614c55fc3 Add rancher chart migration script
The script helps migrate a Longhorn installation from the old Rancher catalog UI to the new chart in the Rancher Apps & Marketplace UI.

Longhorn-3714

Signed-off-by: Phan Le <phan.le@suse.com>
2022-07-07 09:54:18 +08:00
Łukasz Sowa
fe5565dbcf Separate tls ingress option from secure backend
Signed-off-by: Łukasz Sowa <lukasz@owlsome.dev>
2022-06-28 22:46:26 +08:00
Andru Cherny
368d8363da move replicas from UI and driver to values
Signed-off-by: Andru Cherny <wiroatom@gmail.com>
2022-06-24 16:24:11 +08:00
tgfree
1e8dd33559 fix some typo on doc
Signed-off-by: tgfree <tgfree7@gmail.com>
2022-06-22 08:38:42 +08:00
David Ko
30c7eab049
Add 1.3.0 to release status in readme 2022-06-16 11:07:17 +08:00
152 changed files with 12689 additions and 1316 deletions

.codespellignore (new file, +5)

@@ -0,0 +1,5 @@
aks
ec2
eks
gce
gcp

.github/CODEOWNERS (new file, +1)

@@ -0,0 +1 @@
* @longhorn/dev

.github/ISSUE_TEMPLATE/bug.md (new file, +48)

@@ -0,0 +1,48 @@
---
name: Bug report
about: Create a bug report
title: "[BUG]"
labels: ["kind/bug", "require/qa-review-coverage", "require/backport"]
assignees: ''
---
## Describe the bug (🐛 if you encounter this issue)
<!--A clear and concise description of what the bug is.-->
## To Reproduce
<!--Provide the steps to reproduce the behavior.-->
## Expected behavior
<!--A clear and concise description of what you expected to happen.-->
## Support bundle for troubleshooting
<!--Provide a support bundle when the issue happens. You can generate a support bundle using the link at the footer of the Longhorn UI. Check [here](https://longhorn.io/docs/latest/advanced-resources/support-bundle/).-->
## Environment
<!-- Suggest checking the doc of the best practices of using Longhorn. [here](https://longhorn.io/docs/1.5.1/best-practices)-->
- Longhorn version:
- Installation method (e.g. Rancher Catalog App/Helm/Kubectl):
- Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version:
- Number of management node in the cluster:
- Number of worker node in the cluster:
- Node config
- OS type and version:
- Kernel version:
- CPU per node:
- Memory per node:
- Disk type(e.g. SSD/NVMe/HDD):
- Network bandwidth between the nodes:
- Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal):
- Number of Longhorn volumes in the cluster:
- Impacted Longhorn resources:
- Volume names:
## Additional context
<!--Add any other context about the problem here.-->

.github/ISSUE_TEMPLATE/bug_report.md (deleted, -49)

@@ -1,49 +0,0 @@
---
name: Bug report
about: Create a report to help us improve
title: "[BUG]"
labels: kind/bug
assignees: ''
---
## Describe the bug
A clear and concise description of what the bug is.
## To Reproduce
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Perform '....'
4. See error
## Expected behavior
A clear and concise description of what you expected to happen.
## Log or Support bundle
If applicable, add the Longhorn managers' log or support bundle when the issue happens.
You can generate a Support Bundle using the link at the footer of the Longhorn UI.
## Environment
- Longhorn version:
- Installation method (e.g. Rancher Catalog App/Helm/Kubectl):
- Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version:
- Number of management node in the cluster:
- Number of worker node in the cluster:
- Node config
- OS type and version:
- CPU per node:
- Memory per node:
- Disk type(e.g. SSD/NVMe):
- Network bandwidth between the nodes:
- Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal):
- Number of Longhorn volumes in the cluster:
## Additional context
Add any other context about the problem here.

.github/ISSUE_TEMPLATE/doc.md (new file, +16)

@@ -0,0 +1,16 @@
---
name: Document
about: Create or update document
title: "[DOC] "
labels: kind/doc
assignees: ''
---
## What's the document you plan to update? Why? Please describe
<!--A clear and concise description of what the document is.-->
## Additional context
<!--Add any other context or screenshots about the document request here.-->

.github/ISSUE_TEMPLATE/feature.md (new file, +24)

@@ -0,0 +1,24 @@
---
name: Feature request
about: Suggest an idea/feature
title: "[FEATURE] "
labels: ["kind/enhancement", "require/lep", "require/doc", "require/auto-e2e-test"]
assignees: ''
---
## Is your feature request related to a problem? Please describe (👍 if you like this request)
<!--A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]-->
## Describe the solution you'd like
<!--A clear and concise description of what you want to happen-->
## Describe alternatives you've considered
<!--A clear and concise description of any alternative solutions or features you've considered.-->
## Additional context
<!--Add any other context or screenshots about the feature request here.-->

(deleted file, -24)

@@ -1,24 +0,0 @@
---
name: Feature request
about: Suggest an idea for this project
title: "[FEATURE] "
labels: kind/enhancement
assignees: ''
---
## Is your feature request related to a problem? Please describe
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
## Describe the solution you'd like
A clear and concise description of what you want to happen
## Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
## Additional context
Add any other context or screenshots about the feature request here.

.github/ISSUE_TEMPLATE/improvement.md (new file, +24)

@@ -0,0 +1,24 @@
---
name: Improvement request
about: Suggest an improvement of an existing feature
title: "[IMPROVEMENT] "
labels: ["kind/improvement", "require/doc", "require/auto-e2e-test", "require/backport"]
assignees: ''
---
## Is your improvement request related to a feature? Please describe (👍 if you like this request)
<!--A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]-->
## Describe the solution you'd like
<!--A clear and concise description of what you want to happen.-->
## Describe alternatives you've considered
<!--A clear and concise description of any alternative solutions or features you've considered.-->
## Additional context
<!--Add any other context or screenshots about the feature request here.-->

(deleted file, -24)

@@ -1,24 +0,0 @@
---
name: Improvement request
about: Suggest an improvement of an existing feature for this project
title: "[IMPROVEMENT] "
labels: kind/improvement
assignees: ''
---
## Is your improvement request related to a feature? Please describe
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
## Describe the solution you'd like
A clear and concise description of what you want to happen.
## Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
## Additional context
Add any other context or screenshots about the feature request here.

.github/ISSUE_TEMPLATE/infra.md (new file, +24)

@@ -0,0 +1,24 @@
---
name: Infra
about: Create an test/dev infra task
title: "[INFRA] "
labels: kind/infra
assignees: ''
---
## What's the test to develop? Please describe
<!--A clear and concise description of what test/dev infra you want to develop.-->
## Describe the items of the test development (DoD, definition of done) you'd like
<!--
Please use a task list for items on a separate line with a clickable checkbox https://docs.github.com/en/issues/tracking-your-work-with-issues/about-task-lists
- [ ] `item 1`
-->
## Additional context
<!--Add any other context or screenshots about the test infra request here.-->


@ -1,13 +1,14 @@
 ---
 name: Question
-about: Question on Longhorn
+about: Have a question
 title: "[QUESTION] "
 labels: kind/question
 assignees: ''
 ---
 ## Question
-> Suggest to use https://github.com/longhorn/longhorn/discussions to ask questions.
+<!--Suggest to use https://github.com/longhorn/longhorn/discussions to ask questions.-->
 ## Environment
@ -15,6 +16,7 @@ assignees: ''
 - Kubernetes version:
 - Node config
 - OS type and version
+- Kernel version
 - CPU per node:
 - Memory per node:
 - Disk type
@ -23,4 +25,4 @@ assignees: ''
 ## Additional context
-Add any other context about the problem here.
+<!--Add any other context about the problem here.-->

.github/ISSUE_TEMPLATE/refactor.md

@ -0,0 +1,24 @@
---
name: Refactor request
about: Suggest a refactoring request for an existing implementation
title: "[REFACTOR] "
labels: kind/refactoring
assignees: ''
---
## Is your improvement request related to a feature? Please describe
<!--A clear and concise description of what the problem is.-->
## Describe the solution you'd like
<!--A clear and concise description of what you want to happen.-->
## Describe alternatives you've considered
<!--A clear and concise description of any alternative solutions or features you've considered.-->
## Additional context
<!--Add any other context or screenshots about the refactoring request here.-->

.github/ISSUE_TEMPLATE/release.md

@ -0,0 +1,35 @@
---
name: Release task
about: Create a release task
title: "[RELEASE]"
labels: release/task
assignees: ''
---
**What's the task? Please describe.**
Action items for releasing v<x.y.z>
**Describe the sub-tasks.**
- Pre-Release
- [ ] Regression test plan (manual) - @khushboo-rancher
- [ ] Run e2e regression for pre-GA milestones (`install`, `upgrade`) - @yangchiu
- [ ] Run security testing of container images for pre-GA milestones - @yangchiu
- [ ] Verify longhorn chart PR to ensure all artifacts are ready for GA (`install`, `upgrade`) @chriscchien
- [ ] Run core testing (install, upgrade) for the GA build from the previous patch and the last patch of the previous feature release (1.4.2). - @yangchiu
- Release
- [ ] Release longhorn/chart from the release branch to publish to ArtifactHub
- [ ] Release note
- [ ] Deprecation note
- [ ] Upgrade notes including highlighted notes, deprecation, compatible changes, and others impacting the current users
- Post-Release
- [ ] Create a new release branch of manager/ui/tests/engine/longhorn instance-manager/share-manager/backing-image-manager when creating the RC1
- [ ] Update https://github.com/longhorn/longhorn/blob/master/deploy/upgrade_responder_server/chart-values.yaml @PhanLe1010
- [ ] Add another request for the rancher charts for the next patch release (`1.5.1`) @rebeccazzzz
- Rancher charts: verify the chart is able to install & upgrade - @khushboo-rancher
- [ ] rancher/image-mirrors update @weizhe0422 (@PhanLe1010 )
- https://github.com/rancher/image-mirror/pull/412
- [ ] rancher/charts 2.7 branches for rancher marketplace @weizhe0422 (@PhanLe1010)
- `dev-2.7`: https://github.com/rancher/charts/pull/2766
cc @longhorn/qa @longhorn/dev


@ -1,6 +1,6 @@
 ---
 name: Task
-about: Task on Longhorn
+about: Create a general task
 title: "[TASK] "
 labels: kind/task
 assignees: ''
@ -9,13 +9,16 @@ assignees: ''
 ## What's the task? Please describe
-A clear and concise description of what the task is.
+<!--A clear and concise description of what the task is.-->
-## Describe the items of the task (DoD, definition of done) you'd like
+## Describe the sub-tasks
-> Please use a task list for items on a separate line with a clickable checkbox https://docs.github.com/en/issues/tracking-your-work-with-issues/about-task-lists
+<!--
+Please use a task list for items on a separate line with a clickable checkbox https://docs.github.com/en/issues/tracking-your-work-with-issues/about-task-lists
 - [ ] `item 1`
+-->
 ## Additional context
-Add any other context or screenshots about the task request here.
+<!--Add any other context or screenshots about the task request here.-->


@ -1,6 +1,6 @@
 ---
 name: Test
-about: Test task on Longhorn
+about: Create or update test
 title: "[TEST] "
 labels: kind/test
 assignees: ''
@ -9,13 +9,16 @@ assignees: ''
 ## What's the test to develop? Please describe
-A clear and concise description of what the test you want to develop.
+<!--A clear and concise description of what test you want to develop.-->
-## Describe the items of the test development (DoD, definition of done) you'd like
+## Describe the tasks for the test
-> Please use a task list for items on a separate line with a clickable checkbox https://docs.github.com/en/issues/tracking-your-work-with-issues/about-task-lists
+<!--
+Please use a task list for items on a separate line with a clickable checkbox https://docs.github.com/en/issues/tracking-your-work-with-issues/about-task-lists
 - [ ] `item 1`
+-->
 ## Additional context
-Add any other context or screenshots about the test request here.
+<!--Add any other context or screenshots about the test request here.-->

.github/mergify.yml

@ -0,0 +1,34 @@
pull_request_rules:
- name: automatic merge after review
conditions:
- check-success=continuous-integration/drone/pr
- check-success=DCO
- check-success=CodeFactor
- check-success=codespell
- "#approved-reviews-by>=1"
- approved-reviews-by=@longhorn/maintainer
- label=ready-to-merge
actions:
merge:
method: rebase
- name: ask to resolve conflict
conditions:
- conflict
actions:
comment:
message: This pull request is now in conflicts. Could you fix it @{{author}}? 🙏
# Comment on the PR to trigger backport. ex: @Mergifyio copy stable/3.1 stable/4.0
- name: backport patches to stable branch
conditions:
- base=master
actions:
backport:
title: "[BACKPORT][{{ destination_branch }}] {{ title }}"
body: |
This is an automatic backport of pull request #{{number}}.
{{cherry_pick_error}}
assignees:
- "{{ author }}"

.github/workflows/add-to-projects.yml

@ -0,0 +1,40 @@
name: Add-To-Projects
on:
issues:
types: [ opened, labeled ]
jobs:
community:
runs-on: ubuntu-latest
steps:
- name: Is Longhorn Member
uses: tspascoal/get-user-teams-membership@v1.0.4
id: is-longhorn-member
with:
username: ${{ github.event.issue.user.login }}
organization: longhorn
GITHUB_TOKEN: ${{ secrets.CUSTOM_GITHUB_TOKEN }}
- name: Add To Community Project
if: fromJSON(steps.is-longhorn-member.outputs.teams)[0] == null
uses: actions/add-to-project@v0.3.0
with:
project-url: https://github.com/orgs/longhorn/projects/5
github-token: ${{ secrets.CUSTOM_GITHUB_TOKEN }}
qa:
runs-on: ubuntu-latest
steps:
- name: Is Longhorn Member
uses: tspascoal/get-user-teams-membership@v1.0.4
id: is-longhorn-member
with:
username: ${{ github.event.issue.user.login }}
organization: longhorn
GITHUB_TOKEN: ${{ secrets.CUSTOM_GITHUB_TOKEN }}
- name: Add To QA & DevOps Project
if: fromJSON(steps.is-longhorn-member.outputs.teams)[0] != null
uses: actions/add-to-project@v0.3.0
with:
project-url: https://github.com/orgs/longhorn/projects/4
github-token: ${{ secrets.CUSTOM_GITHUB_TOKEN }}
labeled: kind/test, area/infra
label-operator: OR

.github/workflows/close-issue.yml

@ -0,0 +1,50 @@
name: Close-Issue
on:
issues:
types: [ unlabeled ]
jobs:
backport:
runs-on: ubuntu-latest
if: contains(github.event.label.name, 'backport/')
steps:
- name: Get Backport Version
uses: xom9ikk/split@v1
id: split
with:
string: ${{ github.event.label.name }}
separator: /
- name: Check if Backport Issue Exists
uses: actions-cool/issues-helper@v3
id: if-backport-issue-exists
with:
actions: 'find-issues'
token: ${{ github.token }}
title-includes: |
[BACKPORT][v${{ steps.split.outputs._1 }}]${{ github.event.issue.title }}
- name: Close Backport Issue
if: fromJSON(steps.if-backport-issue-exists.outputs.issues)[0] != null
uses: actions-cool/issues-helper@v3
with:
actions: 'close-issue'
token: ${{ github.token }}
issue-number: ${{ fromJSON(steps.if-backport-issue-exists.outputs.issues)[0].number }}
automation:
runs-on: ubuntu-latest
if: contains(github.event.label.name, 'require/automation-e2e')
steps:
- name: Check if Automation Issue Exists
uses: actions-cool/issues-helper@v3
id: if-automation-issue-exists
with:
actions: 'find-issues'
token: ${{ github.token }}
title-includes: |
[TEST]${{ github.event.issue.title }}
- name: Close Automation Test Issue
if: fromJSON(steps.if-automation-issue-exists.outputs.issues)[0] != null
uses: actions-cool/issues-helper@v3
with:
actions: 'close-issue'
token: ${{ github.token }}
issue-number: ${{ fromJSON(steps.if-automation-issue-exists.outputs.issues)[0].number }}

.github/workflows/codespell.yml

@ -0,0 +1,23 @@
name: Codespell
on:
push:
pull_request:
branches:
- master
- "v*.*.*"
jobs:
codespell:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v3
with:
fetch-depth: 1
- name: Check code spell
uses: codespell-project/actions-codespell@v1
with:
check_filenames: true
ignore_words_file: .codespellignore
skip: "*/**.yaml,*/**.yml,*/**.tpl,./deploy,./dev,./scripts,./uninstall"

.github/workflows/create-issue.yml

@ -0,0 +1,114 @@
name: Create-Issue
on:
issues:
types: [ labeled ]
jobs:
backport:
runs-on: ubuntu-latest
if: contains(github.event.label.name, 'backport/')
steps:
- name: Is Longhorn Member
uses: tspascoal/get-user-teams-membership@v1.0.4
id: is-longhorn-member
with:
username: ${{ github.actor }}
organization: longhorn
GITHUB_TOKEN: ${{ secrets.CUSTOM_GITHUB_TOKEN }}
- name: Get Backport Version
if: fromJSON(steps.is-longhorn-member.outputs.teams)[0] != null
uses: xom9ikk/split@v1
id: split
with:
string: ${{ github.event.label.name }}
separator: /
- name: Check if Backport Issue Exists
if: fromJSON(steps.is-longhorn-member.outputs.teams)[0] != null
uses: actions-cool/issues-helper@v3
id: if-backport-issue-exists
with:
actions: 'find-issues'
token: ${{ github.token }}
issue-state: 'all'
title-includes: |
[BACKPORT][v${{ steps.split.outputs._1 }}]${{ github.event.issue.title }}
- name: Get Milestone Object
if: fromJSON(steps.is-longhorn-member.outputs.teams)[0] != null && fromJSON(steps.if-backport-issue-exists.outputs.issues)[0] == null
uses: longhorn/bot/milestone-action@master
id: milestone
with:
token: ${{ github.token }}
repository: ${{ github.repository }}
milestone_name: v${{ steps.split.outputs._1 }}
- name: Get Labels
if: fromJSON(steps.is-longhorn-member.outputs.teams)[0] != null && fromJSON(steps.if-backport-issue-exists.outputs.issues)[0] == null
id: labels
run: |
RAW_LABELS="${{ join(github.event.issue.labels.*.name, ' ') }}"
RAW_LABELS="${RAW_LABELS} kind/backport"
echo "RAW LABELS: $RAW_LABELS"
LABELS=$(echo "$RAW_LABELS" | sed -r 's/\s*backport\S+//g' | sed -r 's/\s*require\/auto-e2e-test//g' | xargs | sed 's/ /, /g')
echo "LABELS: $LABELS"
echo "labels=$LABELS" >> $GITHUB_OUTPUT
- name: Create Backport Issue
if: fromJSON(steps.is-longhorn-member.outputs.teams)[0] != null && fromJSON(steps.if-backport-issue-exists.outputs.issues)[0] == null
uses: dacbd/create-issue-action@v1
id: new-issue
with:
token: ${{ github.token }}
title: |
[BACKPORT][v${{ steps.split.outputs._1 }}]${{ github.event.issue.title }}
body: |
backport ${{ github.event.issue.html_url }}
labels: ${{ steps.labels.outputs.labels }}
milestone: ${{ fromJSON(steps.milestone.outputs.data).number }}
assignees: ${{ join(github.event.issue.assignees.*.login, ', ') }}
- name: Get Repo Id
if: fromJSON(steps.is-longhorn-member.outputs.teams)[0] != null && fromJSON(steps.if-backport-issue-exists.outputs.issues)[0] == null
uses: octokit/request-action@v2.x
id: repo
with:
route: GET /repos/${{ github.repository }}
env:
GITHUB_TOKEN: ${{ github.token }}
- name: Add Backport Issue To Release
if: fromJSON(steps.is-longhorn-member.outputs.teams)[0] != null && fromJSON(steps.if-backport-issue-exists.outputs.issues)[0] == null
uses: longhorn/bot/add-zenhub-release-action@master
with:
zenhub_token: ${{ secrets.ZENHUB_TOKEN }}
repo_id: ${{ fromJSON(steps.repo.outputs.data).id }}
issue_number: ${{ steps.new-issue.outputs.number }}
release_name: ${{ steps.split.outputs._1 }}
automation:
runs-on: ubuntu-latest
if: contains(github.event.label.name, 'require/auto-e2e-test')
steps:
- name: Is Longhorn Member
uses: tspascoal/get-user-teams-membership@v1.0.4
id: is-longhorn-member
with:
username: ${{ github.actor }}
organization: longhorn
GITHUB_TOKEN: ${{ secrets.CUSTOM_GITHUB_TOKEN }}
- name: Check if Automation Issue Exists
if: fromJSON(steps.is-longhorn-member.outputs.teams)[0] != null
uses: actions-cool/issues-helper@v3
id: if-automation-issue-exists
with:
actions: 'find-issues'
token: ${{ github.token }}
issue-state: 'all'
title-includes: |
[TEST]${{ github.event.issue.title }}
- name: Create Automation Test Issue
if: fromJSON(steps.is-longhorn-member.outputs.teams)[0] != null && fromJSON(steps.if-automation-issue-exists.outputs.issues)[0] == null
uses: dacbd/create-issue-action@v1
with:
token: ${{ github.token }}
title: |
[TEST]${{ github.event.issue.title }}
body: |
adding/updating auto e2e test cases for ${{ github.event.issue.html_url }} if they can be automated
cc @longhorn/qa
labels: kind/test

.github/workflows/stale.yaml

@ -0,0 +1,28 @@
name: 'Close stale issues and PRs'
on:
workflow_call:
workflow_dispatch:
schedule:
- cron: '30 1 * * *'
jobs:
stale:
runs-on: ubuntu-latest
steps:
- uses: actions/stale@v4
with:
stale-issue-message: 'This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.'
stale-pr-message: 'This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.'
close-issue-message: 'This issue was closed because it has been stalled for 5 days with no activity.'
close-pr-message: 'This PR was closed because it has been stalled for 10 days with no activity.'
days-before-stale: 30
days-before-pr-stale: 45
days-before-close: 5
days-before-pr-close: 10
stale-issue-label: 'stale'
stale-pr-label: 'stale'
exempt-all-assignees: true
exempt-issue-labels: 'kind/bug,kind/doc,kind/enhancement,kind/poc,kind/refactoring,kind/test,kind/task,kind/backport,kind/regression,kind/evaluation'
exempt-draft-pr: true
exempt-all-milestones: true


@ -1,27 +0,0 @@
name: 'Close stale issues and PRs'
on:
workflow_call:
workflow_dispatch:
schedule:
- cron: '30 1 * * *'
jobs:
stale:
runs-on: ubuntu-latest
steps:
- uses: actions/stale@v4
with:
stale-issue-message: 'This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.'
stale-pr-message: 'This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.'
close-issue-message: 'This issue was closed because it has been stalled for 5 days with no activity.'
close-pr-message: 'This PR was closed because it has been stalled for 10 days with no activity.'
days-before-stale: 30
days-before-pr-stale: 45
days-before-close: 5
days-before-pr-close: 10
stale-issue-label: 'stale'
stale-pr-label: 'stale'
exempt-all-assignees: true
exempt-issue-labels: 'kind/bug,kind/doc,kind/enhancement,kind/poc,kind/refactoring,kind/test,kind/task,kind/backport,kind/regression,kind/evaluation'
exempt-draft-pr: true
exempt-all-milestones: true


@ -0,0 +1,283 @@
## Release Note
**v1.4.0 released!** 🎆
This release introduces many enhancements, improvements, and bug fixes in the areas of stability, performance, data integrity, troubleshooting, and more, as described below. Please try it out and give feedback. Thanks for all the contributions!
- [Kubernetes 1.25 Support](https://github.com/longhorn/longhorn/issues/4003) [[doc]](https://longhorn.io/docs/1.4.0/deploy/important-notes/#pod-security-policies-disabled--pod-security-admission-introduction)
In previous versions, Longhorn relied on Pod Security Policy (PSP) to authorize Longhorn components for privileged operations. From Kubernetes 1.25, PSP has been removed and replaced with Pod Security Admission (PSA). Longhorn v1.4.0 supports opt-in PSP enablement, so it can support Kubernetes versions with or without PSP.
- [ARM64 GA](https://github.com/longhorn/longhorn/issues/4206)
ARM64 has been experimental since Longhorn v1.1.0. After receiving more user feedback and increasing testing coverage, the ARM64 distribution has been stabilized and validated by our regular regression testing, so it now qualifies for general availability.
- [RWX GA](https://github.com/longhorn/longhorn/issues/2293) [[lep]](https://github.com/longhorn/longhorn/blob/master/enhancements/20220727-dedicated-recovery-backend-for-rwx-volume-nfs-server.md)[[doc]](https://longhorn.io/docs/1.4.0/advanced-resources/rwx-workloads/)
RWX has been experimental since Longhorn v1.1.0, but it lacked availability support when the backing Longhorn Share Manager component became unavailable. Longhorn v1.4.0 introduces an NFS recovery backend based on a built-in Kubernetes resource, ConfigMap, for recovering NFS client connections during failover. In addition, the introduction of NFS client hard mode further avoids the potential data loss seen previously. For details, please check the issue and enhancement proposal.
- [Volume Snapshot Checksum](https://github.com/longhorn/longhorn/issues/4210) [[lep]](https://github.com/longhorn/longhorn/blob/master/enhancements/20220922-snapshot-checksum-and-bit-rot-detection.md)[[doc]](https://longhorn.io/docs/1.4.0/references/settings/#snapshot-data-integrity)
Data integrity is a continuous effort for Longhorn. In this version, snapshot checksums have been introduced, along with settings that let users enable or disable checksum calculation in different modes (a hedged Helm values sketch follows after this list).
- [Volume Bit-rot Protection](https://github.com/longhorn/longhorn/issues/3198) [[lep]](https://github.com/longhorn/longhorn/blob/master/enhancements/20220922-snapshot-checksum-and-bit-rot-detection.md)[[doc]](https://longhorn.io/docs/1.4.0/references/settings/#snapshot-data-integrity)
When the Volume Snapshot Checksum feature is enabled, Longhorn periodically calculates and verifies the checksums of volume snapshots, finds corrupted snapshots, and fixes them.
- [Volume Replica Rebuilding Speedup](https://github.com/longhorn/longhorn/issues/4783)
When the Volume Snapshot Checksum feature is enabled, Longhorn uses the calculated snapshot checksums to avoid needless snapshot replication between nodes, improving replica rebuilding speed and reducing resource consumption.
- [Volume Trim](https://github.com/longhorn/longhorn/issues/836) [[lep]](https://github.com/longhorn/longhorn/blob/master/enhancements/20221103-filesystem-trim.md)[[doc]](https://longhorn.io/docs/1.4.0/volumes-and-nodes/trim-filesystem/#trim-the-filesystem-in-a-longhorn-volume)
The Longhorn engine supports the UNMAP SCSI command to reclaim space from the block volume.
- [Online Volume Expansion](https://github.com/longhorn/longhorn/issues/1674) [[doc]](https://longhorn.io/docs/1.4.0/volumes-and-nodes/expansion)
The Longhorn engine supports optional parameters that pass size expansion requests when updating the volume frontend, enabling online volume expansion and filesystem resizing via the CSI node driver (see the PVC sketch after this list).
- [Local Volume via Data Locality Strict Mode](https://github.com/longhorn/longhorn/issues/3957) [[lep]](https://github.com/longhorn/longhorn/blob/master/enhancements/20200819-keep-a-local-replica-to-engine.md)[[doc]](https://longhorn.io/docs/1.4.0/references/settings/#default-data-locality)
Local volumes are based on a new Data Locality setting, Strict Local. It lets users create a single-replica volume that stays in a consistent location, with data transferred between the volume frontend and engine through a local socket instead of the TCP stack, improving performance and reducing resource consumption (see the StorageClass sketch after this list).
- [Volume Recurring Job Backup Restore](https://github.com/longhorn/longhorn/issues/2227) [[lep]](https://github.com/longhorn/longhorn/blob/master/enhancements/20201002-allow-recurring-backup-detached-volumes.md)[[doc]](https://longhorn.io/docs/1.4.0/snapshots-and-backups/backup-and-restore/restore-recurring-jobs-from-a-backup/)
Recurring jobs bound to a volume can be backed up to the remote backup target together with the volume backup metadata, and they can be restored as well for a better operational experience.
- [Volume IO Metrics](https://github.com/longhorn/longhorn/issues/2406) [[doc]](https://longhorn.io/docs/1.4.0/monitoring/metrics/#volume)
Longhorn enriches volume metrics by providing real-time IO stats, including IOPS, latency, and throughput of read/write IO. Users can set up a monitoring solution like Prometheus to monitor volume performance.
- [Longhorn System Backup & Restore](https://github.com/longhorn/longhorn/issues/1455) [[lep]](https://github.com/longhorn/longhorn/blob/master/enhancements/20220913-longhorn-system-backup-restore.md)[[doc]](https://longhorn.io/docs/1.4.0/advanced-resources/system-backup-restore/)
Users can back up the Longhorn system to the remote backup target. Afterward, it can be restored in place to an existing cluster or to a new cluster for specific operational purposes.
- [Support Bundle Enhancement](https://github.com/longhorn/longhorn/issues/2759) [[lep]](https://github.com/longhorn/longhorn/blob/master/enhancements/20221109-support-bundle-enhancement.md)
Longhorn introduces a new support bundle integration based on a general [support bundle kit](https://github.com/rancher/support-bundle-kit) solution. This can help us collect more complete troubleshooting info and simulate the cluster environment.
- [Tunable Timeout between Engine and Replica](https://github.com/longhorn/longhorn/issues/4491) [[doc]](https://longhorn.io/docs/1.4.0/references/settings/#engine-to-replica-timeout)
In current Longhorn versions, the default timeout between the Longhorn engine and replica is fixed and not exposed as a user setting, which can be challenging for users running on low-spec infrastructure. Making the setting configurable allows users to adaptively tune the stability of volume operations.
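To make the Strict Local highlight above more concrete, here is a minimal sketch of a StorageClass requesting a strict-local volume. The parameter names (`dataLocality`, `numberOfReplicas`, `staleReplicaTimeout`) are assumptions taken from the Longhorn StorageClass documentation; verify them against the v1.4.0 docs before use.

```yaml
# Sketch only: a StorageClass for a single-replica, strict-local Longhorn volume.
# Parameter names are assumed from the Longhorn documentation; confirm them for your version.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-strict-local
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "1"        # strict-local volumes keep a single replica, co-located with the engine
  dataLocality: "strict-local" # the new value introduced by this feature
  staleReplicaTimeout: "2880"
```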
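For the online volume expansion highlight, expansion is requested the standard Kubernetes way by raising the PVC's storage request while the workload keeps running; the claim name and sizes below are illustrative only.

```yaml
# Sketch only: grow an attached Longhorn-backed PVC (names and sizes are hypothetical).
# The StorageClass must set allowVolumeExpansion: true; the CSI node driver then resizes
# the filesystem online without detaching the volume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-volume            # existing claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 20Gi            # raised from the original 10Gi to trigger online expansion
```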
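For the snapshot checksum settings mentioned in the highlights, a hedged sketch of Helm values overriding the default setting is shown below; the `defaultSettings.snapshotDataIntegrity` key and its values are assumptions drawn from the settings reference, so verify them for your chart version.

```yaml
# Sketch only: Helm values enabling periodic snapshot checksum calculation and verification.
# Key names and values are assumed from the Longhorn settings reference; verify before use.
defaultSettings:
  snapshotDataIntegrity: "fast-check"          # assumed values: "disabled", "enabled", "fast-check"
  snapshotDataIntegrityCronjob: "0 0 */7 * *"  # assumed key controlling how often full checks run
```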
## Installation
> **Please ensure your Kubernetes cluster is at least v1.21 before installing Longhorn v1.4.0.**
Longhorn supports three installation methods: Rancher App Marketplace, kubectl, and Helm. Follow the installation instructions [here](https://longhorn.io/docs/1.4.0/deploy/install/).
## Upgrade
> **Please ensure your Kubernetes cluster is at least v1.21 before upgrading to Longhorn v1.4.0 from v1.3.x. Upgrading is supported only from v1.3.x.**
Follow the upgrade instructions [here](https://longhorn.io/docs/1.4.0/deploy/upgrade/).
## Deprecation & Incompatibilities
- Pod Security Policy is now an opt-in setting. If you install Longhorn with PSP support, you need to enable it first (see the Helm values sketch below).
- The built-in CSI Snapshotter sidecar is upgraded to v5.0.1. The v1beta1 version of the Volume Snapshot custom resource is deprecated but still supported. However, it will be removed after the CSI Snapshotter is upgraded to v6.1 or later in the future, so please start using the v1 version before the deprecated one is removed.
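As a hedged illustration of the PSP opt-in note above, the Helm chart is expected to expose a value for this; the `enablePSP` key name is an assumption here, so confirm it in the chart's values table before applying.

```yaml
# Sketch only: Helm values override for clusters that still rely on Pod Security Policy.
# The enablePSP key is assumed from the chart's values; verify it for your chart version.
enablePSP: true   # opt in only on Kubernetes versions where PSP still exists (earlier than 1.25)
```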
## Known Issues after Release
Please follow up [here](https://github.com/longhorn/longhorn/wiki/Outstanding-Known-Issues-of-Releases) on any outstanding issues found after this release.
## Highlights
- [FEATURE] Reclaim/Shrink space of volume ([836](https://github.com/longhorn/longhorn/issues/836)) - @yangchiu @derekbit @smallteeths @shuo-wu
- [FEATURE] Backup/Restore Longhorn System ([1455](https://github.com/longhorn/longhorn/issues/1455)) - @c3y1huang @khushboo-rancher
- [FEATURE] Online volume expansion ([1674](https://github.com/longhorn/longhorn/issues/1674)) - @shuo-wu @chriscchien
- [FEATURE] Record recurring schedule in the backups and allow user choose to use it for the restored volume ([2227](https://github.com/longhorn/longhorn/issues/2227)) - @yangchiu @mantissahz
- [FEATURE] NFS support (RWX) GA ([2293](https://github.com/longhorn/longhorn/issues/2293)) - @derekbit @chriscchien
- [FEATURE] Support metrics for Volume IOPS, throughput and latency real time ([2406](https://github.com/longhorn/longhorn/issues/2406)) - @derekbit @roger-ryao
- [FEATURE] Support bundle enhancement ([2759](https://github.com/longhorn/longhorn/issues/2759)) - @c3y1huang @chriscchien
- [FEATURE] Automatic identifying of corrupted replica (bit rot detection) ([3198](https://github.com/longhorn/longhorn/issues/3198)) - @yangchiu @derekbit
- [FEATURE] Local volume for distributed data workloads ([3957](https://github.com/longhorn/longhorn/issues/3957)) - @derekbit @chriscchien
- [IMPROVEMENT] Support K8s 1.25 by updating removed deprecated resource versions like PodSecurityPolicy ([4003](https://github.com/longhorn/longhorn/issues/4003)) - @PhanLe1010 @chriscchien
- [IMPROVEMENT] Faster resync time for fresh replica rebuilding ([4092](https://github.com/longhorn/longhorn/issues/4092)) - @yangchiu @derekbit
- [FEATURE] Introduce checksum for snapshots ([4210](https://github.com/longhorn/longhorn/issues/4210)) - @derekbit @roger-ryao
- [FEATURE] Update K8s version support and component/pkg/build dependencies ([4239](https://github.com/longhorn/longhorn/issues/4239)) - @yangchiu @PhanLe1010
- [BUG] data corruption due to COW and block size not being aligned during rebuilding replicas ([4354](https://github.com/longhorn/longhorn/issues/4354)) - @PhanLe1010 @chriscchien
- [IMPROVEMENT] Adjust the iSCSI timeout and the engine-to-replica timeout settings ([4491](https://github.com/longhorn/longhorn/issues/4491)) - @yangchiu @derekbit
- [IMPROVEMENT] Using specific block size in Longhorn volume's filesystem ([4594](https://github.com/longhorn/longhorn/issues/4594)) - @derekbit @roger-ryao
- [IMPROVEMENT] Speed up replica rebuilding by the metadata such as ctime of snapshot disk files ([4783](https://github.com/longhorn/longhorn/issues/4783)) - @yangchiu @derekbit
## Enhancements
- [FEATURE] Configure successfulJobsHistoryLimit of CronJobs ([1711](https://github.com/longhorn/longhorn/issues/1711)) - @weizhe0422 @chriscchien
- [FEATURE] Allow customization of the cipher used by cryptsetup in volume encryption ([3353](https://github.com/longhorn/longhorn/issues/3353)) - @mantissahz @chriscchien
- [FEATURE] New setting to limit the concurrent volume restoring from backup ([4558](https://github.com/longhorn/longhorn/issues/4558)) - @c3y1huang @chriscchien
- [FEATURE] Make FS format options configurable in storage class ([4642](https://github.com/longhorn/longhorn/issues/4642)) - @weizhe0422 @chriscchien
## Improvement
- [IMPROVEMENT] Change the script into a docker run command mentioned in 'recovery from longhorn backup without system installed' doc ([1521](https://github.com/longhorn/longhorn/issues/1521)) - @weizhe0422 @chriscchien
- [IMPROVEMENT] Improve 'recovery from longhorn backup without system installed' doc. ([1522](https://github.com/longhorn/longhorn/issues/1522)) - @weizhe0422 @roger-ryao
- [IMPROVEMENT] Dump NFS ganesha logs to pod stdout ([2380](https://github.com/longhorn/longhorn/issues/2380)) - @weizhe0422 @roger-ryao
- [IMPROVEMENT] Support failed/obsolete orphaned backup cleanup ([3898](https://github.com/longhorn/longhorn/issues/3898)) - @mantissahz @chriscchien
- [IMPROVEMENT] liveness and readiness probes with longhorn csi plugin daemonset ([3907](https://github.com/longhorn/longhorn/issues/3907)) - @c3y1huang @roger-ryao
- [IMPROVEMENT] Longhorn doesn't reuse failed replica on a disk with full allocated space ([3921](https://github.com/longhorn/longhorn/issues/3921)) - @PhanLe1010 @chriscchien
- [IMPROVEMENT] Reduce syscalls while reading and writing requests in longhorn-engine (engine <-> replica) ([4122](https://github.com/longhorn/longhorn/issues/4122)) - @yangchiu @derekbit
- [IMPROVEMENT] Reduce read and write calls in liblonghorn (tgt <-> engine) ([4133](https://github.com/longhorn/longhorn/issues/4133)) - @derekbit
- [IMPROVEMENT] Replace the GCC allocator in liblonghorn with a more efficient memory allocator ([4136](https://github.com/longhorn/longhorn/issues/4136)) - @yangchiu @derekbit
- [DOC] Update Helm readme and document ([4175](https://github.com/longhorn/longhorn/issues/4175)) - @derekbit
- [IMPROVEMENT] Purging a volume before rebuilding starts ([4183](https://github.com/longhorn/longhorn/issues/4183)) - @yangchiu @shuo-wu
- [IMPROVEMENT] Schedule volumes based on available disk space ([4185](https://github.com/longhorn/longhorn/issues/4185)) - @yangchiu @c3y1huang
- [IMPROVEMENT] Recognize default toleration and node selector to allow Longhorn run on the RKE mixed cluster ([4246](https://github.com/longhorn/longhorn/issues/4246)) - @c3y1huang @chriscchien
- [IMPROVEMENT] Support bundle doesn't collect the snapshot yamls ([4285](https://github.com/longhorn/longhorn/issues/4285)) - @yangchiu @PhanLe1010
- [IMPROVEMENT] Avoid accidentally deleting engine images that are still in use ([4332](https://github.com/longhorn/longhorn/issues/4332)) - @derekbit @chriscchien
- [IMPROVEMENT] Show non-JSON error from backup store ([4336](https://github.com/longhorn/longhorn/issues/4336)) - @c3y1huang
- [IMPROVEMENT] Update nfs-ganesha to v4.0 ([4351](https://github.com/longhorn/longhorn/issues/4351)) - @derekbit
- [IMPROVEMENT] show error when failed to init frontend ([4362](https://github.com/longhorn/longhorn/issues/4362)) - @c3y1huang
- [IMPROVEMENT] Too many debug-level log messages in engine instance-manager ([4427](https://github.com/longhorn/longhorn/issues/4427)) - @derekbit @chriscchien
- [IMPROVEMENT] Add prep work for fixing the corrupted filesystem using fsck in KB ([4440](https://github.com/longhorn/longhorn/issues/4440)) - @derekbit
- [IMPROVEMENT] Prevent users from accidentally uninstalling Longhorn ([4509](https://github.com/longhorn/longhorn/issues/4509)) - @yangchiu @PhanLe1010
- [IMPROVEMENT] add possibility to use nodeSelector on the storageClass ([4574](https://github.com/longhorn/longhorn/issues/4574)) - @weizhe0422 @roger-ryao
- [IMPROVEMENT] Check if node schedulable condition is set before trying to read it ([4581](https://github.com/longhorn/longhorn/issues/4581)) - @weizhe0422 @roger-ryao
- [IMPROVEMENT] Review/consolidate the sectorSize in replica server, replica volume, and engine ([4599](https://github.com/longhorn/longhorn/issues/4599)) - @yangchiu @derekbit
- [IMPROVEMENT] Reorganize longhorn-manager/k8s/patches and auto-generate preserveUnknownFields field ([4600](https://github.com/longhorn/longhorn/issues/4600)) - @yangchiu @derekbit
- [IMPROVEMENT] share-manager pod bypasses the kubernetes scheduler ([4789](https://github.com/longhorn/longhorn/issues/4789)) - @joshimoo @chriscchien
- [IMPROVEMENT] Unify the format of returned error messages in longhorn-engine ([4828](https://github.com/longhorn/longhorn/issues/4828)) - @derekbit
- [IMPROVEMENT] Longhorn system backup/restore UI ([4855](https://github.com/longhorn/longhorn/issues/4855)) - @smallteeths
- [IMPROVEMENT] Replace the modTime (mtime) with ctime in snapshot hash ([4934](https://github.com/longhorn/longhorn/issues/4934)) - @derekbit @chriscchien
- [BUG] volume is stuck in attaching/detaching loop with error `Failed to init frontend: device...` ([4959](https://github.com/longhorn/longhorn/issues/4959)) - @derekbit @PhanLe1010 @chriscchien
- [IMPROVEMENT] Affinity in the longhorn-ui deployment within the helm chart ([4987](https://github.com/longhorn/longhorn/issues/4987)) - @mantissahz @chriscchien
- [IMPROVEMENT] Allow users to change volume.spec.snapshotDataIntegrity on UI ([4994](https://github.com/longhorn/longhorn/issues/4994)) - @yangchiu @smallteeths
- [IMPROVEMENT] Backup and restore recurring jobs on UI ([5009](https://github.com/longhorn/longhorn/issues/5009)) - @smallteeths @chriscchien
- [IMPROVEMENT] Disable `Automatically Delete Workload Pod when The Volume Is Detached Unexpectedly` for RWX volumes ([5017](https://github.com/longhorn/longhorn/issues/5017)) - @derekbit @chriscchien
- [IMPROVEMENT] Enable fast replica rebuilding by default ([5023](https://github.com/longhorn/longhorn/issues/5023)) - @derekbit @roger-ryao
- [IMPROVEMENT] Upgrade tcmalloc in longhorn-engine ([5050](https://github.com/longhorn/longhorn/issues/5050)) - @derekbit
- [IMPROVEMENT] UI show error when backup target is empty for system backup ([5056](https://github.com/longhorn/longhorn/issues/5056)) - @smallteeths @khushboo-rancher
- [IMPROVEMENT] System restore job name should be Longhorn prefixed ([5057](https://github.com/longhorn/longhorn/issues/5057)) - @c3y1huang @khushboo-rancher
- [BUG] Error in logs while restoring the system backup ([5061](https://github.com/longhorn/longhorn/issues/5061)) - @c3y1huang @chriscchien
- [IMPROVEMENT] Add warning message to when deleting the restoring backups ([5065](https://github.com/longhorn/longhorn/issues/5065)) - @smallteeths @khushboo-rancher @roger-ryao
- [IMPROVEMENT] Inconsistent name convention across volume backup restore and system backup restore ([5066](https://github.com/longhorn/longhorn/issues/5066)) - @smallteeths @roger-ryao
- [IMPROVEMENT] System restore should proceed to restore other volumes if restoring one volume keeps failing for a certain time. ([5086](https://github.com/longhorn/longhorn/issues/5086)) - @c3y1huang @khushboo-rancher @roger-ryao
- [IMPROVEMENT] Support customized number of replicas of webhook and recovery-backend ([5087](https://github.com/longhorn/longhorn/issues/5087)) - @derekbit @chriscchien
- [IMPROVEMENT] Simplify the page by placing some configuration items in the advanced configuration when creating the volume ([5090](https://github.com/longhorn/longhorn/issues/5090)) - @yangchiu @smallteeths
- [IMPROVEMENT] Support replica sync client timeout setting to stabilize replica rebuilding ([5110](https://github.com/longhorn/longhorn/issues/5110)) - @derekbit @chriscchien
- [IMPROVEMENT] Set a newly created volume's data integrity from UI to `ignored` rather than `Fast-Check`. ([5126](https://github.com/longhorn/longhorn/issues/5126)) - @yangchiu @smallteeths
## Performance
- [BUG] Turn a node down and up, workload takes longer time to come back online in Longhorn v1.2.0 ([2947](https://github.com/longhorn/longhorn/issues/2947)) - @yangchiu @PhanLe1010
- [TASK] RWX volume performance measurement and investigation ([3665](https://github.com/longhorn/longhorn/issues/3665)) - @derekbit
- [TASK] Verify spinning disk/HDD via the current e2e regression ([4182](https://github.com/longhorn/longhorn/issues/4182)) - @yangchiu
- [BUG] test_csi_snapshot_snap_create_volume_from_snapshot failed when using HDD as Longhorn disks ([4227](https://github.com/longhorn/longhorn/issues/4227)) - @yangchiu @PhanLe1010
- [TASK] Disable tcmalloc in data path because newer tcmalloc version leads to performance drop ([5096](https://github.com/longhorn/longhorn/issues/5096)) - @derekbit @chriscchien
## Stability
- [BUG] Longhorn won't fail all replicas if there is no valid backend during the engine starting stage ([1330](https://github.com/longhorn/longhorn/issues/1330)) - @derekbit @roger-ryao
- [BUG] Every other backup fails and crashes the volume (Segmentation Fault) ([1768](https://github.com/longhorn/longhorn/issues/1768)) - @olljanat @mantissahz
- [BUG] Backend sizes do not match 5368709120 != 10737418240 in the engine initiation phase ([3601](https://github.com/longhorn/longhorn/issues/3601)) - @derekbit @chriscchien
- [BUG] Somehow the Rebuilding field inside volume.meta is set to true causing the volume to stuck in attaching/detaching loop ([4212](https://github.com/longhorn/longhorn/issues/4212)) - @yangchiu @derekbit
- [BUG] Engine binary cannot be recovered after being removed accidentally ([4380](https://github.com/longhorn/longhorn/issues/4380)) - @yangchiu @c3y1huang
- [TASK] Disable tcmalloc in longhorn-engine and longhorn-instance-manager ([5068](https://github.com/longhorn/longhorn/issues/5068)) - @derekbit
## Bugs
- [BUG] Removing old instance records after the new IM pod is launched will take 1 minute ([1363](https://github.com/longhorn/longhorn/issues/1363)) - @mantissahz
- [BUG] Restoring volume stuck forever if the backup is already deleted. ([1867](https://github.com/longhorn/longhorn/issues/1867)) - @mantissahz @chriscchien
- [BUG] Duplicated default instance manager leads to engine/replica cannot be started ([3000](https://github.com/longhorn/longhorn/issues/3000)) - @PhanLe1010 @roger-ryao
- [BUG] Restore from backup sometimes failed if having high frequent recurring backup job w/ retention ([3055](https://github.com/longhorn/longhorn/issues/3055)) - @mantissahz @roger-ryao
- [BUG] Newly created backup stays in `InProgress` when the volume deleted before backup finished ([3122](https://github.com/longhorn/longhorn/issues/3122)) - @mantissahz @chriscchien
- [Bug] Degraded volume generate failed replica make volume unschedulable ([3220](https://github.com/longhorn/longhorn/issues/3220)) - @derekbit @chriscchien
- [BUG] The default access mode of a restored RWX volume is RWO ([3444](https://github.com/longhorn/longhorn/issues/3444)) - @weizhe0422 @roger-ryao
- [BUG] Replica rebuilding failure with error "Replica must be closed, Can not add in state: open" ([3828](https://github.com/longhorn/longhorn/issues/3828)) - @mantissahz @roger-ryao
- [BUG] Max length of volume name not consist between frontend and backend ([3917](https://github.com/longhorn/longhorn/issues/3917)) - @weizhe0422 @roger-ryao
- [BUG] Can't delete volumesnapshot if backup removed first ([4107](https://github.com/longhorn/longhorn/issues/4107)) - @weizhe0422 @chriscchien
- [BUG] A IM-proxy connection not closed in full regression 1.3 ([4113](https://github.com/longhorn/longhorn/issues/4113)) - @c3y1huang @chriscchien
- [BUG] Scale replica warning ([4120](https://github.com/longhorn/longhorn/issues/4120)) - @c3y1huang @chriscchien
- [BUG] Wrong nodeOrDiskEvicted collected in node monitor ([4143](https://github.com/longhorn/longhorn/issues/4143)) - @yangchiu @derekbit
- [BUG] Misleading log "BUG: replica is running but storage IP is empty" ([4153](https://github.com/longhorn/longhorn/issues/4153)) - @shuo-wu @chriscchien
- [BUG] longhorn-manager cannot start while upgrading if the configmap contains volume sensitive settings ([4160](https://github.com/longhorn/longhorn/issues/4160)) - @derekbit @chriscchien
- [BUG] Replica stuck in buggy state with status.currentState is error and the spec.desireState is running ([4197](https://github.com/longhorn/longhorn/issues/4197)) - @yangchiu @PhanLe1010
- [BUG] After updating longhorn to version 1.3.0, only 1 node had problems and I can't even delete it ([4213](https://github.com/longhorn/longhorn/issues/4213)) - @derekbit @c3y1huang @chriscchien
- [BUG] Unable to use a TTY error when running environment_check.sh ([4216](https://github.com/longhorn/longhorn/issues/4216)) - @flkdnt @chriscchien
- [BUG] The last healthy replica may be evicted or removed ([4238](https://github.com/longhorn/longhorn/issues/4238)) - @yangchiu @shuo-wu
- [BUG] Volume detaching and attaching repeatedly while creating multiple snapshots with a same id ([4250](https://github.com/longhorn/longhorn/issues/4250)) - @yangchiu @derekbit
- [BUG] Backing image is not deleted and recreated correctly ([4256](https://github.com/longhorn/longhorn/issues/4256)) - @shuo-wu @chriscchien
- [BUG] longhorn-ui fails to start on RKE2 with cis-1.6 profile for Longhorn v1.3.0 with helm install ([4266](https://github.com/longhorn/longhorn/issues/4266)) - @yangchiu @mantissahz
- [BUG] Longhorn volume stuck in deleting state ([4278](https://github.com/longhorn/longhorn/issues/4278)) - @yangchiu @PhanLe1010
- [BUG] the IP address is duplicate when using storage network and the second network is contronllerd by ovs-cni. ([4281](https://github.com/longhorn/longhorn/issues/4281)) - @mantissahz
- [BUG] build longhorn-ui image error ([4283](https://github.com/longhorn/longhorn/issues/4283)) - @smallteeths
- [BUG] Wrong conditions in the Chart default-setting manifest for Rancher deployed Windows Cluster feature ([4289](https://github.com/longhorn/longhorn/issues/4289)) - @derekbit @chriscchien
- [BUG] Volume operations/rebuilding error during eviction ([4294](https://github.com/longhorn/longhorn/issues/4294)) - @yangchiu @shuo-wu
- [BUG] longhorn-manager deletes same pod multi times when rebooting ([4302](https://github.com/longhorn/longhorn/issues/4302)) - @mantissahz @w13915984028
- [BUG] test_setting_backing_image_auto_cleanup failed because the backing image file isn't deleted on the corresponding node as expected ([4308](https://github.com/longhorn/longhorn/issues/4308)) - @shuo-wu @chriscchien
- [BUG] After automatically force delete terminating pods of deployment on down node, data lost and I/O error ([4384](https://github.com/longhorn/longhorn/issues/4384)) - @yangchiu @derekbit @PhanLe1010
- [BUG] Volume can not attach to node when engine image DaemonSet pods are not fully deployed ([4386](https://github.com/longhorn/longhorn/issues/4386)) - @PhanLe1010 @chriscchien
- [BUG] Error/warning during uninstallation of Longhorn v1.3.1 via manifest ([4405](https://github.com/longhorn/longhorn/issues/4405)) - @PhanLe1010 @roger-ryao
- [BUG] can't upgrade engine if a volume was created in Longhorn v1.0 and the volume.spec.dataLocality is `""` ([4412](https://github.com/longhorn/longhorn/issues/4412)) - @derekbit @chriscchien
- [BUG] Confusing description the label for replica delition ([4430](https://github.com/longhorn/longhorn/issues/4430)) - @yangchiu @smallteeths
- [BUG] Update the Longhorn document in Using the Environment Check Script ([4450](https://github.com/longhorn/longhorn/issues/4450)) - @weizhe0422 @roger-ryao
- [BUG] Unable to search 1.3.1 doc by algolia ([4457](https://github.com/longhorn/longhorn/issues/4457)) - @mantissahz @roger-ryao
- [BUG] Misleading message "The volume is in expansion progress from size 20Gi to 10Gi" if the expansion is invalid ([4475](https://github.com/longhorn/longhorn/issues/4475)) - @yangchiu @smallteeths
- [BUG] Flaky case test_autosalvage_with_data_locality_enabled ([4489](https://github.com/longhorn/longhorn/issues/4489)) - @weizhe0422
- [BUG] Continuously rebuild when auto-balance==least-effort and existing node becomes unschedulable ([4502](https://github.com/longhorn/longhorn/issues/4502)) - @yangchiu @c3y1huang
- [BUG] Inconsistent system snapshots between replicas after rebuilding ([4513](https://github.com/longhorn/longhorn/issues/4513)) - @derekbit
- [BUG] Prometheus metric for backup state (longhorn_backup_state) returns wrong values ([4521](https://github.com/longhorn/longhorn/issues/4521)) - @mantissahz @roger-ryao
- [BUG] Longhorn accidentally schedule all replicas onto a worker node even though the setting Replica Node Level Soft Anti-Affinity is currently disabled ([4546](https://github.com/longhorn/longhorn/issues/4546)) - @yangchiu @mantissahz
- [BUG] LH continuously reports `invalid customized default setting taint-toleration` ([4554](https://github.com/longhorn/longhorn/issues/4554)) - @weizhe0422 @roger-ryao
- [BUG] the values.yaml in the longhorn helm chart contains values not used. ([4601](https://github.com/longhorn/longhorn/issues/4601)) - @weizhe0422 @roger-ryao
- [BUG] longhorn-engine integration test test_restore_to_file_with_backing_file failed after upgrade to sles 15.4 ([4632](https://github.com/longhorn/longhorn/issues/4632)) - @mantissahz
- [BUG] Can not pull a backup created by another Longhorn system from the remote backup target ([4637](https://github.com/longhorn/longhorn/issues/4637)) - @yangchiu @mantissahz @roger-ryao
- [BUG] Fix the share-manager deletion failure if the confimap is not existing ([4648](https://github.com/longhorn/longhorn/issues/4648)) - @derekbit @roger-ryao
- [BUG] Updating volume-scheduling-error failure for RWX volumes and expanding volumes ([4654](https://github.com/longhorn/longhorn/issues/4654)) - @derekbit @chriscchien
- [BUG] charts/longhorn/questions.yaml include oudated csi-image tags ([4669](https://github.com/longhorn/longhorn/issues/4669)) - @PhanLe1010 @roger-ryao
- [BUG] rebuilding the replica failed after upgrading from 1.2.4 to 1.3.2-rc2 ([4705](https://github.com/longhorn/longhorn/issues/4705)) - @derekbit @chriscchien
- [BUG] Cannot re-run helm uninstallation if the first one failed and cannot fetch logs of failed uninstallation pod ([4711](https://github.com/longhorn/longhorn/issues/4711)) - @yangchiu @PhanLe1010 @roger-ryao
- [BUG] The old instance-manager-r Pods are not deleted after upgrade ([4726](https://github.com/longhorn/longhorn/issues/4726)) - @mantissahz @chriscchien
- [BUG] Replica Auto Balance repeatedly delete the local replica and trigger rebuilding ([4761](https://github.com/longhorn/longhorn/issues/4761)) - @c3y1huang @roger-ryao
- [BUG] Volume metafile getting deleted or empty results in a detach-attach loop ([4846](https://github.com/longhorn/longhorn/issues/4846)) - @mantissahz @chriscchien
- [BUG] Backing image is stuck at `in-progress` status if the provided checksum is incorrect ([4852](https://github.com/longhorn/longhorn/issues/4852)) - @FrankYang0529 @chriscchien
- [BUG] Duplicate channel close error in the backing image manage related components ([4865](https://github.com/longhorn/longhorn/issues/4865)) - @weizhe0422 @roger-ryao
- [BUG] The node ID of backing image data source somehow get changed then lead to file handling failed ([4887](https://github.com/longhorn/longhorn/issues/4887)) - @shuo-wu @chriscchien
- [BUG] Cannot upload a backing image larger than 10G ([4902](https://github.com/longhorn/longhorn/issues/4902)) - @smallteeths @shuo-wu @chriscchien
- [BUG] Failed to build longhorn-instance-manager master branch ([4946](https://github.com/longhorn/longhorn/issues/4946)) - @derekbit
- [BUG] PVC only works with plural annotation `volumes.kubernetes.io/storage-provisioner: driver.longhorn.io` ([4951](https://github.com/longhorn/longhorn/issues/4951)) - @weizhe0422
- [BUG] Failed to create a replenished replica process because of the newly adding option ([4962](https://github.com/longhorn/longhorn/issues/4962)) - @yangchiu @derekbit
- [BUG] Incorrect log messages in longhorn-engine processRemoveSnapshot() ([4980](https://github.com/longhorn/longhorn/issues/4980)) - @derekbit
- [BUG] System backup showing wrong age ([5047](https://github.com/longhorn/longhorn/issues/5047)) - @smallteeths @khushboo-rancher
- [BUG] System backup should validate empty backup target ([5055](https://github.com/longhorn/longhorn/issues/5055)) - @c3y1huang @khushboo-rancher
- [BUG] missing the `restoreVolumeRecurringJob` parameter in the VolumeGet API ([5062](https://github.com/longhorn/longhorn/issues/5062)) - @mantissahz @roger-ryao
- [BUG] System restore stuck in restoring if pvc exists with identical name ([5064](https://github.com/longhorn/longhorn/issues/5064)) - @c3y1huang @roger-ryao
- [BUG] No error shown on UI if system backup conf not available ([5072](https://github.com/longhorn/longhorn/issues/5072)) - @c3y1huang @khushboo-rancher
- [BUG] System restore missing services ([5074](https://github.com/longhorn/longhorn/issues/5074)) - @yangchiu @c3y1huang
- [BUG] In a system restore, PV & PVC are not restored if PVC was created with 'longhorn-static' (created via Longhorn GUI) ([5091](https://github.com/longhorn/longhorn/issues/5091)) - @c3y1huang @khushboo-rancher
- [BUG][v1.4.0-rc1] image security scan CRITICAL issues ([5107](https://github.com/longhorn/longhorn/issues/5107)) - @yangchiu @mantissahz
- [BUG] Snapshot trim wrong label in the volume detail page. ([5127](https://github.com/longhorn/longhorn/issues/5127)) - @smallteeths @chriscchien
- [BUG] Filesystem on the volume with a backing image is corrupted after applying trim operation ([5129](https://github.com/longhorn/longhorn/issues/5129)) - @derekbit @chriscchien
- [BUG] Error in uninstall job ([5132](https://github.com/longhorn/longhorn/issues/5132)) - @c3y1huang @chriscchien
- [BUG] Uninstall job unable to delete the systembackup and systemrestore cr. ([5133](https://github.com/longhorn/longhorn/issues/5133)) - @c3y1huang @chriscchien
- [BUG] Nil pointer dereference error on restoring the system backup ([5134](https://github.com/longhorn/longhorn/issues/5134)) - @yangchiu @c3y1huang
- [BUG] UI option Update Replicas Auto Balance should use capital letter like others ([5154](https://github.com/longhorn/longhorn/issues/5154)) - @smallteeths @chriscchien
- [BUG] System restore cannot roll out when volume name is different to the PV ([5157](https://github.com/longhorn/longhorn/issues/5157)) - @yangchiu @c3y1huang
- [BUG] Online expansion doesn't succeed after a failed expansion ([5169](https://github.com/longhorn/longhorn/issues/5169)) - @derekbit @shuo-wu @khushboo-rancher
## Misc
- [DOC] RWX support for NVIDIA JETSON Ubuntu 18.4LTS kernel requires enabling NFSV4.1 ([3157](https://github.com/longhorn/longhorn/issues/3157)) - @yangchiu @derekbit
- [DOC] Add information about encryption algorithm to documentation ([3285](https://github.com/longhorn/longhorn/issues/3285)) - @mantissahz
- [DOC] Update the doc of volume size after introducing snapshot prune ([4158](https://github.com/longhorn/longhorn/issues/4158)) - @shuo-wu
- [Doc] Update the outdated "Customizing Default Settings" document ([4174](https://github.com/longhorn/longhorn/issues/4174)) - @derekbit
- [TASK] Refresh distro version support for 1.4 ([4401](https://github.com/longhorn/longhorn/issues/4401)) - @weizhe0422
- [TASK] Update official document Longhorn Networking ([4478](https://github.com/longhorn/longhorn/issues/4478)) - @derekbit
- [TASK] Update preserveUnknownFields fields in longhorn-manager CRD manifest ([4505](https://github.com/longhorn/longhorn/issues/4505)) - @derekbit @roger-ryao
- [TASK] Disable doc search for archived versions < 1.1 ([4524](https://github.com/longhorn/longhorn/issues/4524)) - @mantissahz
- [TASK] Update longhorn components with the latest backupstore ([4552](https://github.com/longhorn/longhorn/issues/4552)) - @derekbit
- [TASK] Update base image of all components from BCI 15.3 to 15.4 ([4617](https://github.com/longhorn/longhorn/issues/4617)) - @yangchiu
- [DOC] Update the Longhorn document in Install with Helm ([4745](https://github.com/longhorn/longhorn/issues/4745)) - @roger-ryao
- [TASK] Create longhornio support-bundle-kit image ([4911](https://github.com/longhorn/longhorn/issues/4911)) - @yangchiu
- [DOC] Add Recurring * Jobs History Limit to setting reference ([4912](https://github.com/longhorn/longhorn/issues/4912)) - @weizhe0422 @roger-ryao
- [DOC] Add Failed Backup TTL to setting reference ([4913](https://github.com/longhorn/longhorn/issues/4913)) - @mantissahz
- [TASK] Create longhornio liveness probe image ([4945](https://github.com/longhorn/longhorn/issues/4945)) - @yangchiu
- [TASK] Make system managed components branch-based build ([5024](https://github.com/longhorn/longhorn/issues/5024)) - @yangchiu
- [TASK] Remove unstable s390x from PR check for all repos ([5040](https://github.com/longhorn/longhorn/issues/5040)) -
- [TASK] Update longhorn-share-manager's nfs-ganesha to V4.2.1 ([5083](https://github.com/longhorn/longhorn/issues/5083)) - @derekbit @mantissahz
- [DOC] Update the Longhorn document in Setting up Prometheus and Grafana ([5158](https://github.com/longhorn/longhorn/issues/5158)) - @roger-ryao
## Contributors
- @FrankYang0529
- @PhanLe1010
- @c3y1huang
- @chriscchien
- @derekbit
- @flkdnt
- @innobead
- @joshimoo
- @khushboo-rancher
- @mantissahz
- @olljanat
- @roger-ryao
- @shuo-wu
- @smallteeths
- @w13915984028
- @weizhe0422
- @yangchiu


@ -0,0 +1,88 @@
## Release Note
**v1.4.1 released!** 🎆
This release introduces improvements and bug fixes in the areas of stability, performance, space efficiency, resilience, and more, as described below. Please try it out and give feedback. Thanks for all the contributions!
## Installation
> **Please ensure your Kubernetes cluster is at least v1.21 before installing Longhorn v1.4.1.**
Longhorn supports three installation methods: Rancher App Marketplace, kubectl, and Helm. Follow the installation instructions [here](https://longhorn.io/docs/1.4.1/deploy/install/).
## Upgrade
> **Please ensure your Kubernetes cluster is at least v1.21 before upgrading to Longhorn v1.4.1 from v1.3.x/v1.4.0, which are the only supported source versions.**
Follow the upgrade instructions [here](https://longhorn.io/docs/1.4.1/deploy/upgrade/).
## Deprecation & Incompatibilities
N/A
## Known Issues after Release
Please follow up [here](https://github.com/longhorn/longhorn/wiki/Outstanding-Known-Issues-of-Releases) on any outstanding issues found after this release.
## Highlights
- [IMPROVEMENT] Periodically clean up volume snapshots ([3836](https://github.com/longhorn/longhorn/issues/3836)) - @c3y1huang @chriscchien
## Improvement
- [IMPROVEMENT] Do not count the failure replica reuse failure caused by the disconnection ([1923](https://github.com/longhorn/longhorn/issues/1923)) - @yangchiu @mantissahz
- [IMPROVEMENT] Update uninstallation info to include the 'Deleting Confirmation Flag' in chart ([5250](https://github.com/longhorn/longhorn/issues/5250)) - @PhanLe1010 @roger-ryao
- [IMPROVEMENT] Fix Guaranteed Engine Manager CPU recommendation formula in UI ([5338](https://github.com/longhorn/longhorn/issues/5338)) - @c3y1huang @smallteeths @roger-ryao
- [IMPROVEMENT] Update PSP validation in the Longhorn upstream chart ([5339](https://github.com/longhorn/longhorn/issues/5339)) - @yangchiu @PhanLe1010
- [IMPROVEMENT] Update ganesha nfs to 4.2.3 ([5356](https://github.com/longhorn/longhorn/issues/5356)) - @derekbit @roger-ryao
- [IMPROVEMENT] Set write-cache of longhorn block device to off explicitly ([5382](https://github.com/longhorn/longhorn/issues/5382)) - @derekbit @chriscchien
## Stability
- [BUG] Memory leak in CSI plugin caused by stuck umount processes if the RWX volume is already gone ([5296](https://github.com/longhorn/longhorn/issues/5296)) - @derekbit @roger-ryao
- [BUG] share-manager pod failed to restart after kubelet restart ([5507](https://github.com/longhorn/longhorn/issues/5507)) - @yangchiu @derekbit
## Bugs
- [BUG] Longhorn 1.3.2 fails to backup & restore volumes behind Internet proxy ([5054](https://github.com/longhorn/longhorn/issues/5054)) - @mantissahz @chriscchien
- [BUG] RWX doesn't work with release 1.4.0 due to end grace update error from recovery backend ([5183](https://github.com/longhorn/longhorn/issues/5183)) - @derekbit @chriscchien
- [BUG] Incorrect indentation of charts/questions.yaml ([5196](https://github.com/longhorn/longhorn/issues/5196)) - @mantissahz @roger-ryao
- [BUG] Updating option "Allow snapshots removal during trim" for old volumes failed ([5218](https://github.com/longhorn/longhorn/issues/5218)) - @shuo-wu @roger-ryao
- [BUG] Incorrect router retry mechanism ([5259](https://github.com/longhorn/longhorn/issues/5259)) - @mantissahz @chriscchien
- [BUG] System Backup is stuck at Uploading if there are PVs not provisioned by CSI driver ([5286](https://github.com/longhorn/longhorn/issues/5286)) - @c3y1huang @chriscchien
- [BUG] Sync up with backup target during DR volume activation ([5292](https://github.com/longhorn/longhorn/issues/5292)) - @yangchiu @weizhe0422
- [BUG] environment_check.sh does not handle different kernel versions in cluster correctly ([5304](https://github.com/longhorn/longhorn/issues/5304)) - @achims311 @roger-ryao
- [BUG] instance-manager-r high memory consumption ([5312](https://github.com/longhorn/longhorn/issues/5312)) - @derekbit @roger-ryao
- [BUG] Replica rebuilding caused by rke2/kubelet restart ([5340](https://github.com/longhorn/longhorn/issues/5340)) - @derekbit @chriscchien
- [BUG] Error message not consistent between create/update recurring job when retain number greater than 50 ([5434](https://github.com/longhorn/longhorn/issues/5434)) - @c3y1huang @chriscchien
- [BUG] Do not copy Host header to API requests forwarded to Longhorn Manager ([5438](https://github.com/longhorn/longhorn/issues/5438)) - @yangchiu @smallteeths
- [BUG] RWX Volume attachment is getting Failed ([5456](https://github.com/longhorn/longhorn/issues/5456)) - @derekbit
- [BUG] test case test_backup_lock_deletion_during_restoration failed ([5458](https://github.com/longhorn/longhorn/issues/5458)) - @yangchiu @derekbit
- [BUG] [master] [v1.4.1-rc1] Volume restoration will never complete if attached node is down ([5464](https://github.com/longhorn/longhorn/issues/5464)) - @derekbit @weizhe0422 @chriscchien
- [BUG] Unable to create support bundle agent pod in air-gap environment ([5467](https://github.com/longhorn/longhorn/issues/5467)) - @yangchiu @c3y1huang
- [BUG] Node disconnection test failed ([5476](https://github.com/longhorn/longhorn/issues/5476)) - @yangchiu @derekbit
- [BUG] Physical node down test failed ([5477](https://github.com/longhorn/longhorn/issues/5477)) - @derekbit @chriscchien
- [BUG] Backing image with sync failure ([5481](https://github.com/longhorn/longhorn/issues/5481)) - @ChanYiLin @roger-ryao
- [BUG] Example of data migration doesn't work for hidden/./dot-files) ([5484](https://github.com/longhorn/longhorn/issues/5484)) - @hedefalk @shuo-wu @chriscchien
- [BUG] test case test_dr_volume_with_backup_block_deletion failed ([5489](https://github.com/longhorn/longhorn/issues/5489)) - @yangchiu @derekbit
## Misc
- [TASK][UI] add new recurring job tasks ([5272](https://github.com/longhorn/longhorn/issues/5272)) - @smallteeths @chriscchien
## Contributors
- @ChanYiLin
- @PhanLe1010
- @achims311
- @c3y1huang
- @chriscchien
- @derekbit
- @hedefalk
- @innobead
- @mantissahz
- @roger-ryao
- @shuo-wu
- @smallteeths
- @weizhe0422
- @yangchiu


@@ -0,0 +1,92 @@
## Release Note
### **v1.4.2 released!** 🎆
Longhorn v1.4.2 is the latest stable version of Longhorn 1.4.
It introduces improvements and bug fixes in the areas of stability, performance, space efficiency, resilience, and so on. Please try it out and provide feedback. Thanks for all the contributions!
> For the definition of stable or latest release, please check [here](https://github.com/longhorn/longhorn#releases).
## Installation
> **Please ensure your Kubernetes cluster is at least v1.21 before installing v1.4.2.**
Longhorn supports three installation methods: Rancher App Marketplace, kubectl, and Helm. Follow the installation instructions [here](https://longhorn.io/docs/1.4.2/deploy/install/).
## Upgrade
> **Please read the [important notes](https://longhorn.io/docs/1.4.2/deploy/important-notes/) first and ensure your Kubernetes cluster is at least v1.21 before upgrading to Longhorn v1.4.2 from v1.3.x/v1.4.x, which are the only supported source versions.**
Follow the upgrade instructions [here](https://longhorn.io/docs/1.4.2/deploy/upgrade/).
## Deprecation & Incompatibilities
N/A
## Known Issues after Release
Please check [here](https://github.com/longhorn/longhorn/wiki/Outstanding-Known-Issues-of-Releases) for any outstanding issues found after this release.
## Highlights
- [IMPROVEMENT] Use PDB to protect Longhorn components from unexpected drains ([3304](https://github.com/longhorn/longhorn/issues/3304)) - @yangchiu @PhanLe1010
- [IMPROVEMENT] Introduce timeout mechanism for the sparse file syncing service ([4305](https://github.com/longhorn/longhorn/issues/4305)) - @yangchiu @ChanYiLin
- [IMPROVEMENT] Recurring jobs create new snapshots while being not able to clean up old ones ([4898](https://github.com/longhorn/longhorn/issues/4898)) - @mantissahz @chriscchien
## Improvement
- [IMPROVEMENT] Support bundle collects dmesg, syslog and related information of longhorn nodes ([5073](https://github.com/longhorn/longhorn/issues/5073)) - @weizhe0422 @roger-ryao
- [IMPROVEMENT] Fix BackingImage uploading/downloading flow to prevent client timeout ([5443](https://github.com/longhorn/longhorn/issues/5443)) - @ChanYiLin @chriscchien
- [IMPROVEMENT] Create a new setting so that Longhorn removes PDB for instance-manager-r that doesn't have any running instance inside it ([5549](https://github.com/longhorn/longhorn/issues/5549)) - @PhanLe1010 @khushboo-rancher
- [IMPROVEMENT] Deprecate the setting `allow-node-drain-with-last-healthy-replica` and replace it by `node-drain-policy` setting ([5585](https://github.com/longhorn/longhorn/issues/5585)) - @yangchiu @PhanLe1010
- [IMPROVEMENT][UI] Recurring jobs create new snapshots while being not able to clean up old one ([5610](https://github.com/longhorn/longhorn/issues/5610)) - @mantissahz @smallteeths @roger-ryao
- [IMPROVEMENT] Only activate replica if it doesn't have deletion timestamp during volume engine upgrade ([5632](https://github.com/longhorn/longhorn/issues/5632)) - @PhanLe1010 @roger-ryao
- [IMPROVEMENT] Clean up backup target if the backup target setting is unset ([5655](https://github.com/longhorn/longhorn/issues/5655)) - @yangchiu @ChanYiLin
## Resilience
- [BUG] Directly mark replica as failed if the node is deleted ([5542](https://github.com/longhorn/longhorn/issues/5542)) - @weizhe0422 @roger-ryao
- [BUG] RWX volume is stuck at detaching when the attached node is down ([5558](https://github.com/longhorn/longhorn/issues/5558)) - @derekbit @roger-ryao
- [BUG] Backup monitor gets stuck in an infinite loop if backup isn't found ([5662](https://github.com/longhorn/longhorn/issues/5662)) - @derekbit @chriscchien
- [BUG] Resources such as replicas are somehow not mutated when network is unstable ([5762](https://github.com/longhorn/longhorn/issues/5762)) - @derekbit @roger-ryao
- [BUG] Instance manager may not update instance status for a minute after starting ([5809](https://github.com/longhorn/longhorn/issues/5809)) - @ejweber @chriscchien
## Bugs
- [BUG] Delete a uploading backing image, the corresponding LH temp file is not deleted ([3682](https://github.com/longhorn/longhorn/issues/3682)) - @ChanYiLin @chriscchien
- [BUG] Can not create backup in engine image not fully deployed cluster ([5248](https://github.com/longhorn/longhorn/issues/5248)) - @ChanYiLin @roger-ryao
- [BUG] Upgrade engine --> spec.restoreVolumeRecurringJob and spec.snapshotDataIntegrity Unsupported value ([5485](https://github.com/longhorn/longhorn/issues/5485)) - @yangchiu @derekbit
- [BUG] Bulk backup deletion cause restoring volume to finish with attached state. ([5506](https://github.com/longhorn/longhorn/issues/5506)) - @ChanYiLin @roger-ryao
- [BUG] volume expansion starts for no reason, gets stuck on current size > expected size ([5513](https://github.com/longhorn/longhorn/issues/5513)) - @mantissahz @roger-ryao
- [BUG] RWX volume attachment failed if tried more enough times ([5537](https://github.com/longhorn/longhorn/issues/5537)) - @yangchiu @derekbit
- [BUG] instance-manager-e emits `Wait for process pvc-xxxx to shutdown` constantly ([5575](https://github.com/longhorn/longhorn/issues/5575)) - @derekbit @roger-ryao
- [BUG] Support bundle kit should respect node selector & taint toleration ([5614](https://github.com/longhorn/longhorn/issues/5614)) - @yangchiu @c3y1huang
- [BUG] Value overlapped in page Instance Manager Image ([5622](https://github.com/longhorn/longhorn/issues/5622)) - @smallteeths @chriscchien
- [BUG] Instance manager PDB created with wrong selector thus blocking the draining of the wrongly selected node forever ([5680](https://github.com/longhorn/longhorn/issues/5680)) - @PhanLe1010 @chriscchien
- [BUG] During volume live engine upgrade, if the replica pod is killed, the volume is stuck in upgrading forever ([5684](https://github.com/longhorn/longhorn/issues/5684)) - @yangchiu @PhanLe1010
- [BUG] Instance manager PDBs cannot be removed if the longhorn-manager pod on its spec node is not available ([5688](https://github.com/longhorn/longhorn/issues/5688)) - @PhanLe1010 @roger-ryao
- [BUG] Rebuild rebuilding is possibly issued to a wrong replica ([5709](https://github.com/longhorn/longhorn/issues/5709)) - @ejweber @roger-ryao
- [BUG] longhorn upgrade is not upgrading engineimage ([5740](https://github.com/longhorn/longhorn/issues/5740)) - @shuo-wu @chriscchien
- [BUG] `test_replica_auto_balance_when_replica_on_unschedulable_node` Error in creating volume with nodeSelector and dataLocality parameters ([5745](https://github.com/longhorn/longhorn/issues/5745)) - @c3y1huang @roger-ryao
- [BUG] Unable to backup volume after NFS server IP change ([5856](https://github.com/longhorn/longhorn/issues/5856)) - @derekbit @roger-ryao
## Misc
- [TASK] Check and update the networking doc & example YAMLs ([5651](https://github.com/longhorn/longhorn/issues/5651)) - @yangchiu @shuo-wu
## Contributors
- @ChanYiLin
- @PhanLe1010
- @c3y1huang
- @chriscchien
- @derekbit
- @ejweber
- @innobead
- @khushboo-rancher
- @mantissahz
- @roger-ryao
- @shuo-wu
- @smallteeths
- @weizhe0422
- @yangchiu


@@ -0,0 +1,74 @@
## Release Note
### **v1.4.3 released!** 🎆
Longhorn v1.4.3 is the latest stable version of Longhorn 1.4.
It introduces improvements and bug fixes in the areas of stability, resilience, and so on. Please try it out and provide feedback. Thanks for all the contributions!
> For the definition of stable or latest release, please check [here](https://github.com/longhorn/longhorn#releases).
## Installation
> **Please ensure your Kubernetes cluster is at least v1.21 before installing v1.4.3.**
Longhorn supports three installation methods: Rancher App Marketplace, kubectl, and Helm. Follow the installation instructions [here](https://longhorn.io/docs/1.4.3/deploy/install/).
## Upgrade
> **Please read the [important notes](https://longhorn.io/docs/1.4.3/deploy/important-notes/) first and ensure your Kubernetes cluster is at least v1.21 before upgrading to Longhorn v1.4.3 from v1.3.x/v1.4.x, which are the only supported source versions.**
Follow the upgrade instructions [here](https://longhorn.io/docs/1.4.3/deploy/upgrade/).
## Deprecation & Incompatibilities
N/A
## Known Issues after Release
Please check [here](https://github.com/longhorn/longhorn/wiki/Outstanding-Known-Issues-of-Releases) for any outstanding issues found after this release.
## Improvement
- [IMPROVEMENT] Assign the pods to the same node where the strict-local volume is present ([5448](https://github.com/longhorn/longhorn/issues/5448)) - @c3y1huang @chriscchien
## Resilience
- [BUG] filesystem corrupted after delete instance-manager-r for a locality best-effort volume ([5801](https://github.com/longhorn/longhorn/issues/5801)) - @yangchiu @ChanYiLin @mantissahz
## Bugs
- [BUG] 'Upgrade Engine' still shows up in a specific situation when engine already upgraded ([3063](https://github.com/longhorn/longhorn/issues/3063)) - @weizhe0422 @PhanLe1010 @smallteeths
- [BUG] DR volume even after activation remains in standby mode if there are one or more failed replicas. ([3069](https://github.com/longhorn/longhorn/issues/3069)) - @yangchiu @mantissahz
- [BUG] Prevent Longhorn uninstallation from getting stuck due to backups in error ([5868](https://github.com/longhorn/longhorn/issues/5868)) - @ChanYiLin @mantissahz
- [BUG] Unable to create support bundle if the previous one stayed in ReadyForDownload phase ([5882](https://github.com/longhorn/longhorn/issues/5882)) - @c3y1huang @roger-ryao
- [BUG] share-manager for a given pvc keep restarting (other pvc are working fine) ([5954](https://github.com/longhorn/longhorn/issues/5954)) - @yangchiu @derekbit
- [BUG] Replica auto-rebalance doesn't respect node selector ([5971](https://github.com/longhorn/longhorn/issues/5971)) - @c3y1huang @roger-ryao
- [BUG] Extra snapshot generated when clone from a detached volume ([5986](https://github.com/longhorn/longhorn/issues/5986)) - @weizhe0422 @ejweber
- [BUG] User created snapshot deleted after node drain and uncordon ([5992](https://github.com/longhorn/longhorn/issues/5992)) - @yangchiu @mantissahz
- [BUG] In some specific situation, system backup auto deleted when creating another one ([6045](https://github.com/longhorn/longhorn/issues/6045)) - @c3y1huang @chriscchien
- [BUG] Backing Image deletion stuck if it's deleted during uploading process and bids is ready-for-transfer state ([6086](https://github.com/longhorn/longhorn/issues/6086)) - @WebberHuang1118 @chriscchien
- [BUG] Backing image manager fails when SELinux is enabled ([6108](https://github.com/longhorn/longhorn/issues/6108)) - @ejweber @chriscchien
- [BUG] test_dr_volume_with_restore_command_error failed ([6130](https://github.com/longhorn/longhorn/issues/6130)) - @mantissahz @roger-ryao
- [BUG] Longhorn doesn't remove the system backups crd on uninstallation ([6185](https://github.com/longhorn/longhorn/issues/6185)) - @c3y1huang @khushboo-rancher
- [BUG] Test case test_ha_backup_deletion_recovery failed in rhel or rockylinux arm64 environment ([6213](https://github.com/longhorn/longhorn/issues/6213)) - @yangchiu @ChanYiLin @mantissahz
- [BUG] Engine continues to attempt to rebuild replica while detaching ([6217](https://github.com/longhorn/longhorn/issues/6217)) - @yangchiu @ejweber
- [BUG] Unable to receive support bundle from UI when it's large (400MB+) ([6256](https://github.com/longhorn/longhorn/issues/6256)) - @c3y1huang @chriscchien
- [BUG] Migration test case failed: unable to detach volume migration is not ready yet ([6238](https://github.com/longhorn/longhorn/issues/6238)) - @yangchiu @PhanLe1010 @khushboo-rancher
- [BUG] Restored Volumes stuck in attaching state ([6239](https://github.com/longhorn/longhorn/issues/6239)) - @derekbit @roger-ryao
## Contributors
- @ChanYiLin
- @PhanLe1010
- @WebberHuang1118
- @c3y1huang
- @chriscchien
- @derekbit
- @ejweber
- @innobead
- @khushboo-rancher
- @mantissahz
- @roger-ryao
- @smallteeths
- @weizhe0422
- @yangchiu


@@ -0,0 +1,301 @@
## Release Note
### **v1.5.0 released!** 🎆
Longhorn v1.5.0 is the latest version of Longhorn 1.5.
It introduces many enhancements, improvements, and bug fixes as described below, covering performance, stability, maintenance, resilience, and more. Please try it out and provide feedback. Thanks for all the contributions!
> For the definition of stable or latest release, please check [here](https://github.com/longhorn/longhorn#releases).
- [v2 Data Engine based on SPDK - Preview](https://github.com/longhorn/longhorn/issues/5751)
> **Please note that this is a preview feature and should not be used in any production environment. A preview feature is disabled by default and may change in subsequent versions until it reaches general availability.**
In addition to the existing iSCSI stack (v1) data engine, we are introducing the v2 data engine based on SPDK (Storage Performance Development Kit). This release includes the introduction of volume lifecycle management, degraded volume handling, offline replica rebuilding, block device management, and orphaned replica management. For the performance benchmark and comparison with v1, check the report [here](https://longhorn.io/docs/1.5.0/spdk/performance-benchmark/).
- [Longhorn Volume Attachment](https://github.com/longhorn/longhorn/issues/3715)
Introducing the new Longhorn VolumeAttachment CR, which ensures exclusive attachment and supports automatic volume attachment and detachment for various headless operations such as volume cloning, backing image export, and recurring jobs.
- [Cluster Autoscaler - GA](https://github.com/longhorn/longhorn/issues/5238)
Cluster Autoscaler was initially introduced as an experimental feature in v1.3. After undergoing automatic validation on different public cloud Kubernetes distributions and receiving user feedback, it has now reached general availability.
- [Instance Manager Engine & Replica Consolidation](https://github.com/longhorn/longhorn/issues/5208)
Previously, there were two separate instance manager pods responsible for volume engine and replica process management. However, this setup required high resource usage, especially during live upgrades. In this release, we have merged these pods into a single instance manager, reducing the initial resource requirements.
- [Volume Backup Compression Methods](https://github.com/longhorn/longhorn/issues/5189)
Longhorn supports different compression methods for volume backups, including lz4, gzip, or no compression. This allows users to choose the most suitable method based on their data type and usage requirements (a settings sketch follows this list).
- [Automatic Volume Trim Recurring Job](https://github.com/longhorn/longhorn/issues/5186)
While volume filesystem trim was introduced in v1.4, users had to perform the operation manually. Starting with this release, users can create a recurring job that automatically runs the trim process, improving space efficiency without requiring human intervention (an example RecurringJob sketch follows this list).
- [RWX Volume Trim](https://github.com/longhorn/longhorn/issues/5143)
Longhorn now supports filesystem trim for RWX (Read-Write-Many) volumes, extending the trim functionality beyond RWO (Read-Write-Once) volumes.
- [Upgrade Path Enforcement & Downgrade Prevention](https://github.com/longhorn/longhorn/issues/5131)
To ensure compatibility after an upgrade, we have implemented upgrade path enforcement. This prevents unintended downgrades and ensures the system and data remain intact.
- [Backing Image Management via CSI VolumeSnapshot](https://github.com/longhorn/longhorn/issues/5005)
Users can now use the unified CSI VolumeSnapshot interface to manage Backing Images in the same way as volume snapshots and backups (a VolumeSnapshotClass sketch follows this list).
- [Snapshot Cleanup & Delete Recurring Job](https://github.com/longhorn/longhorn/issues/3836)
Introducing two new recurring job types specifically designed for snapshot cleanup and deletion. These jobs allow users to remove unnecessary snapshots for better space efficiency (see the RecurringJob sketches after this list).
- [CIFS Backup Store](https://github.com/longhorn/longhorn/issues/3599) & [Azure Backup Store](https://github.com/longhorn/longhorn/issues/1309)
To enhance users' backup strategies and align with data governance policies, Longhorn now supports additional backup storage protocols, including CIFS and Azure (a backup target sketch follows this list).
- [Kubernetes Upgrade Node Drain Policy](https://github.com/longhorn/longhorn/issues/3304)
The new Node Drain Policy provides flexible strategies to protect volume data during Kubernetes upgrades or node maintenance operations, ensuring the integrity and availability of your volumes (a settings sketch follows this list).
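The sketches below illustrate a few of the highlights above with hedged, example-only YAML. Resource names are placeholders, and any setting, task, or parameter name not spelled out in this note is an assumption to verify against the documentation for your version. First, choosing the backup compression method by editing its Longhorn `Setting` resource (the same change can be made from the UI), assuming the setting is named `backup-compression-method` and accepts `none`, `lz4`, or `gzip`:

```yaml
# Hedged example: choose the volume backup compression method.
# The setting name and the accepted values (none | lz4 | gzip) are assumed
# from the feature description above, not confirmed by this release note.
apiVersion: longhorn.io/v1beta2
kind: Setting
metadata:
  name: backup-compression-method
  namespace: longhorn-system
value: "lz4"
```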
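A minimal sketch of the automatic trim recurring job, assuming the `RecurringJob` CRD in `longhorn.io/v1beta2` and a task type named `filesystem-trim`:

```yaml
# Hedged example: trim the filesystems of all volumes in the "default" group
# every night at 02:00. Schema and task name are assumptions.
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: nightly-trim
  namespace: longhorn-system
spec:
  task: filesystem-trim   # assumed task name for the automatic trim job
  cron: "0 2 * * *"       # every day at 02:00
  groups:
  - default               # volumes in the default group pick this job up
  retain: 0               # trim produces nothing to retain
  concurrency: 1
```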
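A hedged sketch of managing Backing Images through the CSI snapshot interface: a dedicated `VolumeSnapshotClass` whose parameters select the backing-image flavor. The `type: bi` and `export-type` parameters are assumptions based on the description above:

```yaml
# Hedged example: VolumeSnapshotClass that handles VolumeSnapshots as Longhorn
# Backing Images (parameter names are assumed, not confirmed by this release note).
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: longhorn-backing-image-vsc
driver: driver.longhorn.io
deletionPolicy: Delete
parameters:
  type: bi            # assumed value selecting the backing image flavor of snapshots
  export-type: qcow2  # assumed optional parameter controlling the exported image format
```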
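The snapshot cleanup and delete recurring jobs can be declared the same way, assuming the task names `snapshot-cleanup` and `snapshot-delete`:

```yaml
# Hedged example: purge unneeded snapshots weekly and cap the number of
# snapshots per volume. Task names and retain semantics are assumptions.
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: weekly-snapshot-cleanup
  namespace: longhorn-system
spec:
  task: snapshot-cleanup   # assumed task that cleans up unnecessary snapshots
  cron: "0 3 * * 0"        # Sundays at 03:00
  groups:
  - default
  retain: 0
  concurrency: 1
---
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: weekly-snapshot-delete
  namespace: longhorn-system
spec:
  task: snapshot-delete    # assumed task that deletes snapshots beyond the retain count
  cron: "30 3 * * 0"       # Sundays at 03:30
  groups:
  - default
  retain: 10               # keep at most 10 snapshots per volume
  concurrency: 1
```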
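A hedged sketch of pointing the backup target at one of the new backup stores; the `cifs://` and `azblob://` URL forms are assumptions based on the description above, and the server, share, container, and secret names are placeholders:

```yaml
# Hedged example: backup target settings for a CIFS share. For Azure Blob
# Storage the value would instead look like azblob://<container>@core.windows.net/
# (assumed form), with account credentials stored in the referenced secret.
apiVersion: longhorn.io/v1beta2
kind: Setting
metadata:
  name: backup-target
  namespace: longhorn-system
value: "cifs://smb-server.example.com/longhorn-backupstore"
---
apiVersion: longhorn.io/v1beta2
kind: Setting
metadata:
  name: backup-target-credential-secret
  namespace: longhorn-system
value: "backupstore-credentials"   # placeholder secret holding the CIFS or Azure credentials
```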
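A hedged sketch of the Node Drain Policy setting, assuming it is named `node-drain-policy` (the replacement for the deprecated `allow-node-drain-with-last-healthy-replica` mentioned elsewhere in these notes) and accepts values such as `block-if-contains-last-replica`, `allow-if-replica-is-stopped`, and `always-allow`:

```yaml
# Hedged example: keep the conservative behavior that blocks a drain while the
# node still holds the last healthy replica of any volume. Setting name and
# values are assumptions based on the descriptions in this release note.
apiVersion: longhorn.io/v1beta2
kind: Setting
metadata:
  name: node-drain-policy
  namespace: longhorn-system
value: "block-if-contains-last-replica"
```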
## Installation
> **Please ensure your Kubernetes cluster is at least v1.21 before installing Longhorn v1.5.0.**
Longhorn supports three installation methods: Rancher App Marketplace, kubectl, and Helm. Follow the installation instructions [here](https://longhorn.io/docs/1.5.0/deploy/install/).
## Upgrade
> **Please ensure your Kubernetes cluster is at least v1.21 before upgrading to Longhorn v1.5.0 from v1.4.x, which is the only supported source version.**
Follow the upgrade instructions [here](https://longhorn.io/docs/1.5.0/deploy/upgrade/).
## Deprecation & Incompatibilities
Please check the [important notes](https://longhorn.io/docs/1.5.0/deploy/important-notes/) to know more about deprecated, removed, incompatible features and important changes. If you upgrade indirectly from an older version like v1.3.x, please also check the corresponding important note for each upgrade version path.
## Known Issues after Release
Please check [here](https://github.com/longhorn/longhorn/wiki/Outstanding-Known-Issues-of-Releases) for any outstanding issues found after this release.
## Highlights
- [DOC] Provide the user guide for Kubernetes upgrade ([494](https://github.com/longhorn/longhorn/issues/494)) - @PhanLe1010
- [FEATURE] Backups to Azure Blob Storage ([1309](https://github.com/longhorn/longhorn/issues/1309)) - @mantissahz @chriscchien
- [IMPROVEMENT] Use PDB to protect Longhorn components from unexpected drains ([3304](https://github.com/longhorn/longhorn/issues/3304)) - @yangchiu @PhanLe1010
- [FEATURE] CIFS Backup Store Support ([3599](https://github.com/longhorn/longhorn/issues/3599)) - @derekbit @chriscchien
- [IMPROVEMENT] Consolidate volume attach/detach implementation ([3715](https://github.com/longhorn/longhorn/issues/3715)) - @yangchiu @PhanLe1010
- [IMPROVEMENT] Periodically clean up volume snapshots ([3836](https://github.com/longhorn/longhorn/issues/3836)) - @c3y1huang @chriscchien
- [IMPROVEMENT] Introduce timeout mechanism for the sparse file syncing service ([4305](https://github.com/longhorn/longhorn/issues/4305)) - @yangchiu @ChanYiLin
- [IMPROVEMENT] Recurring jobs create new snapshots while being not able to clean up old ones ([4898](https://github.com/longhorn/longhorn/issues/4898)) - @mantissahz @chriscchien
- [FEATURE] BackingImage Management via VolumeSnapshot ([5005](https://github.com/longhorn/longhorn/issues/5005)) - @ChanYiLin @chriscchien
- [FEATURE] Upgrade path enforcement & downgrade prevention ([5131](https://github.com/longhorn/longhorn/issues/5131)) - @yangchiu @mantissahz
- [FEATURE] Support RWX volume trim ([5143](https://github.com/longhorn/longhorn/issues/5143)) - @derekbit @chriscchien
- [FEATURE] Auto Trim via recurring job ([5186](https://github.com/longhorn/longhorn/issues/5186)) - @c3y1huang @chriscchien
- [FEATURE] Introduce faster compression and multiple threads for volume backup & restore ([5189](https://github.com/longhorn/longhorn/issues/5189)) - @derekbit @roger-ryao
- [FEATURE] Consolidate Instance Manager Engine & Replica for resource consumption reduction ([5208](https://github.com/longhorn/longhorn/issues/5208)) - @yangchiu @c3y1huang
- [FEATURE] Cluster Autoscaler Support GA ([5238](https://github.com/longhorn/longhorn/issues/5238)) - @yangchiu @c3y1huang
- [FEATURE] Update K8s version support and component/pkg/build dependencies for Longhorn 1.5 ([5595](https://github.com/longhorn/longhorn/issues/5595)) - @yangchiu @ejweber
- [FEATURE] Support SPDK Data Engine - Preview ([5751](https://github.com/longhorn/longhorn/issues/5751)) - @derekbit @shuo-wu @DamiaSan
## Enhancements
- [FEATURE] Allow users to directly activate a restoring/DR volume as long as there is one ready replica. ([1512](https://github.com/longhorn/longhorn/issues/1512)) - @mantissahz @weizhe0422
- [REFACTOR] volume controller refactoring/split up, to simplify the control flow ([2527](https://github.com/longhorn/longhorn/issues/2527)) - @PhanLe1010 @chriscchien
- [FEATURE] Import and export SPDK longhorn volumes to longhorn sparse file directory ([4100](https://github.com/longhorn/longhorn/issues/4100)) - @DamiaSan
- [FEATURE] Add a global `storage reserved` setting for newly created longhorn nodes' disks ([4773](https://github.com/longhorn/longhorn/issues/4773)) - @mantissahz @chriscchien
- [FEATURE] Support backup volumes during system backup ([5011](https://github.com/longhorn/longhorn/issues/5011)) - @c3y1huang @chriscchien
- [FEATURE] Support SPDK lvol shallow copy for newly replica creation ([5217](https://github.com/longhorn/longhorn/issues/5217)) - @DamiaSan
- [FEATURE] Introduce longhorn-spdk-engine for SPDK volume management ([5282](https://github.com/longhorn/longhorn/issues/5282)) - @shuo-wu
- [FEATURE] Support replica-zone-soft-anti-affinity setting per volume ([5358](https://github.com/longhorn/longhorn/issues/5358)) - @ChanYiLin @smallteeths @chriscchien
- [FEATURE] Install Opt-In NetworkPolicies ([5403](https://github.com/longhorn/longhorn/issues/5403)) - @yangchiu @ChanYiLin
- [FEATURE] Create Longhorn SPDK Engine component with basic fundamental functions ([5406](https://github.com/longhorn/longhorn/issues/5406)) - @shuo-wu
- [FEATURE] Add status APIs for shallow copy and IO pause/resume ([5647](https://github.com/longhorn/longhorn/issues/5647)) - @DamiaSan
- [FEATURE] Introduce a new disk type, disk management and replica scheduler for SPDK volumes ([5683](https://github.com/longhorn/longhorn/issues/5683)) - @derekbit @roger-ryao
- [FEATURE] Support replica scheduling for SPDK volume ([5711](https://github.com/longhorn/longhorn/issues/5711)) - @derekbit
- [FEATURE] Create SPDK gRPC service for instance manager ([5712](https://github.com/longhorn/longhorn/issues/5712)) - @shuo-wu
- [FEATURE] Environment check script for Longhorn with SPDK ([5738](https://github.com/longhorn/longhorn/issues/5738)) - @derekbit @chriscchien
- [FEATURE] Deployment manifests for helping install SPDK dependencies, utilities and libraries ([5739](https://github.com/longhorn/longhorn/issues/5739)) - @yangchiu @derekbit
- [FEATURE] Implement Disk gRPC Service in Instance Manager for collecting SPDK disk statistics from SPDK gRPC service ([5744](https://github.com/longhorn/longhorn/issues/5744)) - @derekbit @chriscchien
- [FEATURE] Support for SPDK RAID1 by setting the minimum number of base_bdevs to 1 ([5758](https://github.com/longhorn/longhorn/issues/5758)) - @yangchiu @DamiaSan
- [FEATURE] Add a global setting for enabling and disabling SPDK feature ([5778](https://github.com/longhorn/longhorn/issues/5778)) - @yangchiu @derekbit
- [FEATURE] Identify and manage orphaned lvols and raid bdevs if the associated `Volume` resources are not existing ([5827](https://github.com/longhorn/longhorn/issues/5827)) - @yangchiu @derekbit
- [FEATURE] Longhorn UI for SPDK feature ([5846](https://github.com/longhorn/longhorn/issues/5846)) - @smallteeths @chriscchien
- [FEATURE] UI modification to work with new AD mechanism (Longhorn UI -> Longhorn API) ([6004](https://github.com/longhorn/longhorn/issues/6004)) - @yangchiu @smallteeths
- [FEATURE] Replica offline rebuild over SPDK - data engine ([6067](https://github.com/longhorn/longhorn/issues/6067)) - @shuo-wu
- [FEATURE] Support automatic offline replica rebuilding of volumes using SPDK data engine ([6071](https://github.com/longhorn/longhorn/issues/6071)) - @yangchiu @derekbit
## Improvement
- [IMPROVEMENT] Do not count the failure replica reuse failure caused by the disconnection ([1923](https://github.com/longhorn/longhorn/issues/1923)) - @yangchiu @mantissahz
- [IMPROVEMENT] Consider changing the over provisioning default/recommendation to 100% percentage (no over provisioning) ([2694](https://github.com/longhorn/longhorn/issues/2694)) - @c3y1huang @chriscchien
- [BUG] StorageClass of pv and pvc of a recovered pv should not always be default. ([3506](https://github.com/longhorn/longhorn/issues/3506)) - @ChanYiLin @smallteeths @roger-ryao
- [IMPROVEMENT] Auto-attach volume for K8s CSI snapshot ([3726](https://github.com/longhorn/longhorn/issues/3726)) - @weizhe0422 @PhanLe1010
- [IMPROVEMENT] Change Longhorn API to create/delete snapshot CRs instead of calling engine CLI ([3995](https://github.com/longhorn/longhorn/issues/3995)) - @yangchiu @PhanLe1010
- [IMPROVEMENT] Add support for crypto parameters for RWX volumes ([4829](https://github.com/longhorn/longhorn/issues/4829)) - @mantissahz @roger-ryao
- [IMPROVEMENT] Remove the global setting `mkfs-ext4-parameters` ([4914](https://github.com/longhorn/longhorn/issues/4914)) - @ejweber @roger-ryao
- [IMPROVEMENT] Move all snapshot related settings at one place. ([4930](https://github.com/longhorn/longhorn/issues/4930)) - @smallteeths @roger-ryao
- [IMPROVEMENT] Remove system managed component image settings ([5028](https://github.com/longhorn/longhorn/issues/5028)) - @mantissahz @chriscchien
- [IMPROVEMENT] Set default `engine-replica-timeout` value for engine controller start command ([5031](https://github.com/longhorn/longhorn/issues/5031)) - @derekbit @chriscchien
- [IMPROVEMENT] Support bundle collects dmesg, syslog and related information of longhorn nodes ([5073](https://github.com/longhorn/longhorn/issues/5073)) - @weizhe0422 @roger-ryao
- [IMPROVEMENT] Collect volume, system, feature info for metrics for better usage awareness ([5235](https://github.com/longhorn/longhorn/issues/5235)) - @c3y1huang @chriscchien @roger-ryao
- [IMPROVEMENT] Update uninstallation info to include the 'Deleting Confirmation Flag' in chart ([5250](https://github.com/longhorn/longhorn/issues/5250)) - @PhanLe1010 @roger-ryao
- [IMPROVEMENT] Disable Revision Counter for Strict-Local dataLocality ([5257](https://github.com/longhorn/longhorn/issues/5257)) - @derekbit @roger-ryao
- [IMPROVEMENT] Fix Guaranteed Engine Manager CPU recommendation formula in UI ([5338](https://github.com/longhorn/longhorn/issues/5338)) - @c3y1huang @smallteeths @roger-ryao
- [IMPROVEMENT] Update PSP validation in the Longhorn upstream chart ([5339](https://github.com/longhorn/longhorn/issues/5339)) - @yangchiu @PhanLe1010
- [IMPROVEMENT] Update ganesha nfs to 4.2.3 ([5356](https://github.com/longhorn/longhorn/issues/5356)) - @derekbit @roger-ryao
- [IMPROVEMENT] Set write-cache of longhorn block device to off explicitly ([5382](https://github.com/longhorn/longhorn/issues/5382)) - @derekbit @chriscchien
- [IMPROVEMENT] Clean up unused backupstore mountpoint ([5391](https://github.com/longhorn/longhorn/issues/5391)) - @derekbit @chriscchien
- [DOC] Update Kubernetes version info to have consistent description from the longhorn documentation in chart ([5399](https://github.com/longhorn/longhorn/issues/5399)) - @ChanYiLin @roger-ryao
- [IMPROVEMENT] Fix BackingImage uploading/downloading flow to prevent client timeout ([5443](https://github.com/longhorn/longhorn/issues/5443)) - @ChanYiLin @chriscchien
- [IMPROVEMENT] Assign the pods to the same node where the strict-local volume is present ([5448](https://github.com/longhorn/longhorn/issues/5448)) - @c3y1huang @chriscchien
- [IMPROVEMENT] Have explicitly message when trying to attach a volume which it's engine and replica were on deleted node ([5545](https://github.com/longhorn/longhorn/issues/5545)) - @ChanYiLin @chriscchien
- [IMPROVEMENT] Create a new setting so that Longhorn removes PDB for instance-manager-r that doesn't have any running instance inside it ([5549](https://github.com/longhorn/longhorn/issues/5549)) - @PhanLe1010 @roger-ryao
- [IMPROVEMENT] Merge conversion/admission webhook and recovery backend services into longhorn-manager ([5590](https://github.com/longhorn/longhorn/issues/5590)) - @ChanYiLin @chriscchien
- [IMPROVEMENT][UI] Recurring jobs create new snapshots while being not able to clean up old one ([5610](https://github.com/longhorn/longhorn/issues/5610)) - @mantissahz @smallteeths @roger-ryao
- [IMPROVEMENT] Only activate replica if it doesn't have deletion timestamp during volume engine upgrade ([5632](https://github.com/longhorn/longhorn/issues/5632)) - @PhanLe1010 @roger-ryao
- [IMPROVEMENT] Clean up backup target if the backup target setting is unset ([5655](https://github.com/longhorn/longhorn/issues/5655)) - @yangchiu @ChanYiLin
- [IMPROVEMENT] Bump CSI sidecar components' version ([5672](https://github.com/longhorn/longhorn/issues/5672)) - @yangchiu @ejweber
- [IMPROVEMENT] Configure log level of Longhorn components ([5888](https://github.com/longhorn/longhorn/issues/5888)) - @ChanYiLin @weizhe0422
- [IMPROVEMENT] Remove development toolchain from Longhorn images ([6022](https://github.com/longhorn/longhorn/issues/6022)) - @ChanYiLin @derekbit
- [IMPROVEMENT] Reduce replica process's number of allocated ports ([6079](https://github.com/longhorn/longhorn/issues/6079)) - @ChanYiLin @derekbit
- [IMPROVEMENT] UI supports automatic replica rebuilding for SPDK volumes ([6107](https://github.com/longhorn/longhorn/issues/6107)) - @smallteeths @roger-ryao
- [IMPROVEMENT] Minor UX changes for Longhorn SPDK ([6126](https://github.com/longhorn/longhorn/issues/6126)) - @derekbit @roger-ryao
- [IMPROVEMENT] Instance manager spdk_tgt resilience due to spdk_tgt crash ([6155](https://github.com/longhorn/longhorn/issues/6155)) - @yangchiu @derekbit
- [IMPROVEMENT] Determine number of replica/engine port count in longhorn-manager (control plane) instead ([6163](https://github.com/longhorn/longhorn/issues/6163)) - @derekbit @chriscchien
- [IMPROVEMENT] SPDK client should functions after encountering decoding error ([6191](https://github.com/longhorn/longhorn/issues/6191)) - @yangchiu @shuo-wu
## Performance
- [REFACTORING] Evaluate the impact of removing the client side compression for backup blocks ([1409](https://github.com/longhorn/longhorn/issues/1409)) - @derekbit
## Resilience
- [BUG] If backing image downloading fails on one node, it doesn't try on other nodes. ([3746](https://github.com/longhorn/longhorn/issues/3746)) - @ChanYiLin
- [BUG] Replica rebuilding caused by rke2/kubelet restart ([5340](https://github.com/longhorn/longhorn/issues/5340)) - @derekbit @chriscchien
- [BUG] Volume restoration will never complete if attached node is down ([5464](https://github.com/longhorn/longhorn/issues/5464)) - @derekbit @weizhe0422 @chriscchien
- [BUG] Node disconnection test failed ([5476](https://github.com/longhorn/longhorn/issues/5476)) - @yangchiu @derekbit
- [BUG] Physical node down test failed ([5477](https://github.com/longhorn/longhorn/issues/5477)) - @derekbit @chriscchien
- [BUG] Backing image with sync failure ([5481](https://github.com/longhorn/longhorn/issues/5481)) - @ChanYiLin @roger-ryao
- [BUG] share-manager pod failed to restart after kubelet restart ([5507](https://github.com/longhorn/longhorn/issues/5507)) - @yangchiu @derekbit
- [BUG] Directly mark replica as failed if the node is deleted ([5542](https://github.com/longhorn/longhorn/issues/5542)) - @weizhe0422 @roger-ryao
- [BUG] RWX volume is stuck at detaching when the attached node is down ([5558](https://github.com/longhorn/longhorn/issues/5558)) - @derekbit @roger-ryao
- [BUG] Unable to export RAID1 bdev in degraded state ([5650](https://github.com/longhorn/longhorn/issues/5650)) - @chriscchien @DamiaSan
- [BUG] Backup monitor gets stuck in an infinite loop if backup isn't found ([5662](https://github.com/longhorn/longhorn/issues/5662)) - @derekbit @chriscchien
- [BUG] Resources such as replicas are somehow not mutated when network is unstable ([5762](https://github.com/longhorn/longhorn/issues/5762)) - @derekbit @roger-ryao
- [BUG] filesystem corrupted after delete instance-manager-r for a locality best-effort volume ([5801](https://github.com/longhorn/longhorn/issues/5801)) - @yangchiu @ChanYiLin @mantissahz
## Stability
- [BUG] nfs backup broken - NFS server: mkdir - file exists ([4626](https://github.com/longhorn/longhorn/issues/4626)) - @yangchiu @derekbit
- [BUG] Memory leak in CSI plugin caused by stuck umount processes if the RWX volume is already gone ([5296](https://github.com/longhorn/longhorn/issues/5296)) - @derekbit @roger-ryao
## Bugs
- [BUG] 'Upgrade Engine' still shows up in a specific situation when engine already upgraded ([3063](https://github.com/longhorn/longhorn/issues/3063)) - @weizhe0422 @PhanLe1010 @smallteeths
- [BUG] DR volume even after activation remains in standby mode if there are one or more failed replicas. ([3069](https://github.com/longhorn/longhorn/issues/3069)) - @yangchiu @mantissahz
- [BUG] volume not able to attach with raw type backing image ([3437](https://github.com/longhorn/longhorn/issues/3437)) - @yangchiu @ChanYiLin
- [BUG] Delete a uploading backing image, the corresponding LH temp file is not deleted ([3682](https://github.com/longhorn/longhorn/issues/3682)) - @ChanYiLin @chriscchien
- [BUG] Cloned PVC from detached volume will stuck at not ready for workload ([3692](https://github.com/longhorn/longhorn/issues/3692)) - @PhanLe1010 @chriscchien
- [BUG] Block device volume failed to unmount when it is detached unexpectedly ([3778](https://github.com/longhorn/longhorn/issues/3778)) - @PhanLe1010 @chriscchien
- [BUG] After migration of Longhorn from Rancher old UI to dashboard, the csi-plugin doesn't update ([4519](https://github.com/longhorn/longhorn/issues/4519)) - @mantissahz @roger-ryao
- [BUG] Volumes Stuck in Attach/Detach Loop when running on OpenShift/OKD ([4988](https://github.com/longhorn/longhorn/issues/4988)) - @ChanYiLin
- [BUG] Longhorn 1.3.2 fails to backup & restore volumes behind Internet proxy ([5054](https://github.com/longhorn/longhorn/issues/5054)) - @mantissahz @chriscchien
- [BUG] Instance manager pod does not respect of node taint? ([5161](https://github.com/longhorn/longhorn/issues/5161)) - @ejweber
- [BUG] RWX doesn't work with release 1.4.0 due to end grace update error from recovery backend ([5183](https://github.com/longhorn/longhorn/issues/5183)) - @derekbit @chriscchien
- [BUG] Incorrect indentation of charts/questions.yaml ([5196](https://github.com/longhorn/longhorn/issues/5196)) - @mantissahz @roger-ryao
- [BUG] Updating option "Allow snapshots removal during trim" for old volumes failed ([5218](https://github.com/longhorn/longhorn/issues/5218)) - @shuo-wu @roger-ryao
- [BUG] Since 1.4.0 RWX volume failing regularly ([5224](https://github.com/longhorn/longhorn/issues/5224)) - @derekbit
- [BUG] Can not create backup in engine image not fully deployed cluster ([5248](https://github.com/longhorn/longhorn/issues/5248)) - @ChanYiLin @roger-ryao
- [BUG] Incorrect router retry mechanism ([5259](https://github.com/longhorn/longhorn/issues/5259)) - @mantissahz @chriscchien
- [BUG] System Backup is stuck at Uploading if there are PVs not provisioned by CSI driver ([5286](https://github.com/longhorn/longhorn/issues/5286)) - @c3y1huang @chriscchien
- [BUG] Sync up with backup target during DR volume activation ([5292](https://github.com/longhorn/longhorn/issues/5292)) - @yangchiu @weizhe0422
- [BUG] environment_check.sh does not handle different kernel versions in cluster correctly ([5304](https://github.com/longhorn/longhorn/issues/5304)) - @achims311 @roger-ryao
- [BUG] instance-manager-r high memory consumption ([5312](https://github.com/longhorn/longhorn/issues/5312)) - @derekbit @roger-ryao
- [BUG] Unable to upgrade longhorn from v1.3.2 to master-head ([5368](https://github.com/longhorn/longhorn/issues/5368)) - @yangchiu @derekbit
- [BUG] Modify engineManagerCPURequest and replicaManagerCPURequest won't raise resource request in instance-manager-e pod ([5419](https://github.com/longhorn/longhorn/issues/5419)) - @c3y1huang
- [BUG] Error message not consistent between create/update recurring job when retain number greater than 50 ([5434](https://github.com/longhorn/longhorn/issues/5434)) - @c3y1huang @chriscchien
- [BUG] Do not copy Host header to API requests forwarded to Longhorn Manager ([5438](https://github.com/longhorn/longhorn/issues/5438)) - @yangchiu @smallteeths
- [BUG] RWX Volume attachment is getting Failed ([5456](https://github.com/longhorn/longhorn/issues/5456)) - @derekbit
- [BUG] test case test_backup_lock_deletion_during_restoration failed ([5458](https://github.com/longhorn/longhorn/issues/5458)) - @yangchiu @derekbit
- [BUG] Unable to create support bundle agent pod in air-gap environment ([5467](https://github.com/longhorn/longhorn/issues/5467)) - @yangchiu @c3y1huang
- [BUG] Example of data migration doesn't work for hidden/./dot-files) ([5484](https://github.com/longhorn/longhorn/issues/5484)) - @hedefalk @shuo-wu @chriscchien
- [BUG] Upgrade engine --> spec.restoreVolumeRecurringJob and spec.snapshotDataIntegrity Unsupported value ([5485](https://github.com/longhorn/longhorn/issues/5485)) - @yangchiu @derekbit
- [BUG] test case test_dr_volume_with_backup_block_deletion failed ([5489](https://github.com/longhorn/longhorn/issues/5489)) - @yangchiu @derekbit
- [BUG] Bulk backup deletion cause restoring volume to finish with attached state. ([5506](https://github.com/longhorn/longhorn/issues/5506)) - @ChanYiLin @roger-ryao
- [BUG] volume expansion starts for no reason, gets stuck on current size > expected size ([5513](https://github.com/longhorn/longhorn/issues/5513)) - @mantissahz @roger-ryao
- [BUG] RWX volume attachment failed if tried more enough times ([5537](https://github.com/longhorn/longhorn/issues/5537)) - @yangchiu @derekbit
- [BUG] instance-manager-e emits `Wait for process pvc-xxxx to shutdown` constantly ([5575](https://github.com/longhorn/longhorn/issues/5575)) - @derekbit @roger-ryao
- [BUG] Support bundle kit should respect node selector & taint toleration ([5614](https://github.com/longhorn/longhorn/issues/5614)) - @yangchiu @c3y1huang
- [BUG] Value overlapped in page Instance Manager Image ([5622](https://github.com/longhorn/longhorn/issues/5622)) - @smallteeths @chriscchien
- [BUG] Updated Rocky 9 (and others) can't attach due to SELinux ([5627](https://github.com/longhorn/longhorn/issues/5627)) - @yangchiu @ejweber
- [BUG] Fix misleading error messages when creating a mount point for a backup store ([5630](https://github.com/longhorn/longhorn/issues/5630)) - @derekbit
- [BUG] Instance manager PDB created with wrong selector thus blocking the draining of the wrongly selected node forever ([5680](https://github.com/longhorn/longhorn/issues/5680)) - @PhanLe1010 @chriscchien
- [BUG] During volume live engine upgrade, if the replica pod is killed, the volume is stuck in upgrading forever ([5684](https://github.com/longhorn/longhorn/issues/5684)) - @yangchiu @PhanLe1010
- [BUG] Instance manager PDBs cannot be removed if the longhorn-manager pod on its spec node is not available ([5688](https://github.com/longhorn/longhorn/issues/5688)) - @PhanLe1010 @roger-ryao
- [BUG] Rebuild rebuilding is possibly issued to a wrong replica ([5709](https://github.com/longhorn/longhorn/issues/5709)) - @ejweber @roger-ryao
- [BUG] Observing replica on new IM-r before upgrading of volume ([5729](https://github.com/longhorn/longhorn/issues/5729)) - @c3y1huang
- [BUG] longhorn upgrade is not upgrading engineimage ([5740](https://github.com/longhorn/longhorn/issues/5740)) - @shuo-wu @chriscchien
- [BUG] `test_replica_auto_balance_when_replica_on_unschedulable_node` Error in creating volume with nodeSelector and dataLocality parameters ([5745](https://github.com/longhorn/longhorn/issues/5745)) - @c3y1huang @roger-ryao
- [BUG] Unable to backup volume after NFS server IP change ([5856](https://github.com/longhorn/longhorn/issues/5856)) - @derekbit @roger-ryao
- [BUG] Prevent Longhorn uninstallation from getting stuck due to backups in error ([5868](https://github.com/longhorn/longhorn/issues/5868)) - @ChanYiLin @mantissahz
- [BUG] Unable to create support bundle if the previous one stayed in ReadyForDownload phase ([5882](https://github.com/longhorn/longhorn/issues/5882)) - @c3y1huang @roger-ryao
- [BUG] share-manager for a given pvc keep restarting (other pvc are working fine) ([5954](https://github.com/longhorn/longhorn/issues/5954)) - @yangchiu @derekbit
- [BUG] Replica auto-rebalance doesn't respect node selector ([5971](https://github.com/longhorn/longhorn/issues/5971)) - @c3y1huang @roger-ryao
- [BUG] Volume detached automatically after upgrade Longhorn ([5983](https://github.com/longhorn/longhorn/issues/5983)) - @yangchiu @PhanLe1010
- [BUG] Extra snapshot generated when clone from a detached volume ([5986](https://github.com/longhorn/longhorn/issues/5986)) - @weizhe0422 @ejweber
- [BUG] User created snapshot deleted after node drain and uncordon ([5992](https://github.com/longhorn/longhorn/issues/5992)) - @yangchiu @mantissahz
- [BUG] Webhook PDBs are not removed after upgrading to master-head ([6026](https://github.com/longhorn/longhorn/issues/6026)) - @weizhe0422 @PhanLe1010
- [BUG] In some specific situation, system backup auto deleted when creating another one ([6045](https://github.com/longhorn/longhorn/issues/6045)) - @c3y1huang @chriscchien
- [BUG] Backing Image deletion stuck if it's deleted during uploading process and bids is ready-for-transfer state ([6086](https://github.com/longhorn/longhorn/issues/6086)) - @WebberHuang1118 @chriscchien
- [BUG] A backup target backed by a Samba server is not recognized ([6100](https://github.com/longhorn/longhorn/issues/6100)) - @derekbit @weizhe0422
- [BUG] Backing image manager fails when SELinux is enabled ([6108](https://github.com/longhorn/longhorn/issues/6108)) - @ejweber @chriscchien
- [BUG] Force delete volume make SPDK disk unschedule ([6110](https://github.com/longhorn/longhorn/issues/6110)) - @derekbit
- [BUG] share-manager terminated during Longhorn upgrading causes rwx volume not working ([6120](https://github.com/longhorn/longhorn/issues/6120)) - @yangchiu @derekbit
- [BUG] SPDK Volume snapshotList API Error ([6123](https://github.com/longhorn/longhorn/issues/6123)) - @derekbit @chriscchien
- [BUG] test_recurring_jobs_allow_detached_volume failed ([6124](https://github.com/longhorn/longhorn/issues/6124)) - @ChanYiLin @roger-ryao
- [BUG] Cron job triggered replica rebuilding keeps repeating itself after corrupting snapshot data ([6129](https://github.com/longhorn/longhorn/issues/6129)) - @yangchiu @mantissahz
- [BUG] test_dr_volume_with_restore_command_error failed ([6130](https://github.com/longhorn/longhorn/issues/6130)) - @mantissahz @roger-ryao
- [BUG] RWX volume remains attached after workload deleted if it's upgraded from v1.4.2 ([6139](https://github.com/longhorn/longhorn/issues/6139)) - @PhanLe1010 @chriscchien
- [BUG] timestamp or checksum not matched in test_snapshot_hash_detect_corruption test case ([6145](https://github.com/longhorn/longhorn/issues/6145)) - @yangchiu @derekbit
- [BUG] When a v2 volume is attached in maintenance mode, removing a replica will lead to volume stuck in attaching-detaching loop ([6166](https://github.com/longhorn/longhorn/issues/6166)) - @derekbit @chriscchien
- [BUG] Misleading offline rebuilding hint if offline rebuilding is not enabled ([6169](https://github.com/longhorn/longhorn/issues/6169)) - @smallteeths @roger-ryao
- [BUG] Longhorn doesn't remove the system backups crd on uninstallation ([6185](https://github.com/longhorn/longhorn/issues/6185)) - @c3y1huang @khushboo-rancher
- [BUG] Volume attachment related error logs in uninstaller pod ([6197](https://github.com/longhorn/longhorn/issues/6197)) - @yangchiu @PhanLe1010
- [BUG] Test case test_ha_backup_deletion_recovery failed in rhel or rockylinux arm64 environment ([6213](https://github.com/longhorn/longhorn/issues/6213)) - @yangchiu @ChanYiLin @mantissahz
- [BUG] migration test cases could fail due to unexpected volume controllers and replicas status ([6215](https://github.com/longhorn/longhorn/issues/6215)) - @yangchiu @PhanLe1010
- [BUG] Engine continues to attempt to rebuild replica while detaching ([6217](https://github.com/longhorn/longhorn/issues/6217)) - @yangchiu @ejweber
## Misc
- [TASK] Remove deprecated volume spec recurringJobs and storageClass recurringJobs field ([2865](https://github.com/longhorn/longhorn/issues/2865)) - @c3y1huang @chriscchien
- [TASK] Remove deprecated fields after CRD API version bump ([3289](https://github.com/longhorn/longhorn/issues/3289)) - @c3y1huang @roger-ryao
- [TASK] Replace jobq lib with an alternative way for listing remote backup volumes and info ([4176](https://github.com/longhorn/longhorn/issues/4176)) - @ChanYiLin @chriscchien
- [DOC] Update the Longhorn document in Uninstalling Longhorn using kubectl ([4841](https://github.com/longhorn/longhorn/issues/4841)) - @roger-ryao
- [TASK] Remove a deprecated feature `disable-replica-rebuild` from longhorn-manager ([4997](https://github.com/longhorn/longhorn/issues/4997)) - @ejweber @chriscchien
- [TASK] Update the distro matrix supports on Longhorn docs for 1.5 ([5177](https://github.com/longhorn/longhorn/issues/5177)) - @yangchiu
- [TASK] Clarify if any upcoming K8s API deprecation/removal will impact Longhorn 1.4 ([5180](https://github.com/longhorn/longhorn/issues/5180)) - @PhanLe1010
- [TASK] Revert affinity for Longhorn user deployed components ([5191](https://github.com/longhorn/longhorn/issues/5191)) - @weizhe0422 @ejweber
- [TASK] Add GitHub action for CI to lib repos for supporting dependency bot ([5239](https://github.com/longhorn/longhorn/issues/5239)) -
- [DOC] Update the readme of longhorn-spdk-engine about using new Longhorn (RAID1) bdev ([5256](https://github.com/longhorn/longhorn/issues/5256)) - @DamiaSan
- [TASK][UI] add new recurring job tasks ([5272](https://github.com/longhorn/longhorn/issues/5272)) - @smallteeths @chriscchien
- [DOC] Update the node maintenance doc to cover upgrade prerequisites for Rancher ([5278](https://github.com/longhorn/longhorn/issues/5278)) - @PhanLe1010
- [TASK] Run build-engine-test-images automatically when having incompatible engine on master ([5400](https://github.com/longhorn/longhorn/issues/5400)) - @yangchiu
- [TASK] Update k8s.gcr.io to registry.k8s.io in repos ([5432](https://github.com/longhorn/longhorn/issues/5432)) - @yangchiu
- [TASK][UI] add new recurring job task - filesystem trim ([5529](https://github.com/longhorn/longhorn/issues/5529)) - @smallteeths @chriscchien
- doc: update prerequisites in chart readme to make it consistent with documentation v1.3.x ([5531](https://github.com/longhorn/longhorn/pull/5531)) - @ChanYiLin
- [FEATURE] Remove deprecated `allow-node-drain-with-last-healthy-replica` ([5620](https://github.com/longhorn/longhorn/issues/5620)) - @weizhe0422 @PhanLe1010
- [FEATURE] Set recurring jobs to PVCs ([5791](https://github.com/longhorn/longhorn/issues/5791)) - @yangchiu @c3y1huang
- [TASK] Automatically update crds.yaml in longhorn repo from longhorn-manager repo ([5854](https://github.com/longhorn/longhorn/issues/5854)) - @yangchiu
- [IMPROVEMENT] Remove privilege requirement from lifecycle jobs ([5862](https://github.com/longhorn/longhorn/issues/5862)) - @mantissahz @chriscchien
- [TASK][UI] support new aio typed instance managers ([5876](https://github.com/longhorn/longhorn/issues/5876)) - @smallteeths @chriscchien
- [TASK] Remove `Guaranteed Engine Manager CPU`, `Guaranteed Replica Manager CPU`, and `Guaranteed Engine CPU` settings. ([5917](https://github.com/longhorn/longhorn/issues/5917)) - @c3y1huang @roger-ryao
- [TASK][UI] Support volume backup policy ([6028](https://github.com/longhorn/longhorn/issues/6028)) - @smallteeths @chriscchien
- [TASK] Reduce BackupConcurrentLimit and RestoreConcurrentLimit default values ([6135](https://github.com/longhorn/longhorn/issues/6135)) - @derekbit @chriscchien
## Contributors
- @ChanYiLin
- @DamiaSan
- @PhanLe1010
- @WebberHuang1118
- @achims311
- @c3y1huang
- @chriscchien
- @derekbit
- @ejweber
- @hedefalk
- @innobead
- @khushboo-rancher
- @mantissahz
- @roger-ryao
- @shuo-wu
- @smallteeths
- @weizhe0422
- @yangchiu


@@ -0,0 +1,65 @@
## Release Note
### **v1.5.1 released!** 🎆
Longhorn v1.5.1 is the latest version of Longhorn 1.5.
This release introduces bug fixes as described below, addressing v1.5.0 upgrade issues, stability, troubleshooting, and more. Please try it out and provide feedback. Thanks for all the contributions!
> For the definition of stable or latest release, please check [here](https://github.com/longhorn/longhorn#releases).
## Installation
> **Please ensure your Kubernetes cluster is at least v1.21 before installing v1.5.1.**
Longhorn supports three installation methods: Rancher App Marketplace, kubectl, and Helm. Follow the installation instructions [here](https://longhorn.io/docs/1.5.1/deploy/install/).
## Upgrade
> **Please read the [important notes](https://longhorn.io/docs/1.5.1/deploy/important-notes/) first and ensure your Kubernetes cluster is at least v1.21 before upgrading to Longhorn v1.5.1 from v1.4.x/v1.5.0, which are the only supported source versions.**
Follow the upgrade instructions [here](https://longhorn.io/docs/1.5.1/deploy/upgrade/).
## Deprecation & Incompatibilities
N/A
## Known Issues after Release
Please check [here](https://github.com/longhorn/longhorn/wiki/Outstanding-Known-Issues-of-Releases) for any outstanding issues found after this release.
## Improvement
- [IMPROVEMENT] Implement/fix the unit tests of Volume Attachment and volume controller ([6005](https://github.com/longhorn/longhorn/issues/6005)) - @PhanLe1010
- [QUESTION] Repetitive warnings and errors in a new longhorn setup ([6257](https://github.com/longhorn/longhorn/issues/6257)) - @derekbit @c3y1huang @roger-ryao
## Resilience
- [BUG] 1.5.0 Upgrade: Longhorn conversion webhook server fails ([6259](https://github.com/longhorn/longhorn/issues/6259)) - @derekbit @roger-ryao
- [BUG] Race leaves snapshot CRs that cannot be deleted ([6298](https://github.com/longhorn/longhorn/issues/6298)) - @yangchiu @PhanLe1010 @ejweber
## Bugs
- [BUG] Engine continues to attempt to rebuild replica while detaching ([6217](https://github.com/longhorn/longhorn/issues/6217)) - @yangchiu @ejweber
- [BUG] Upgrade to 1.5.0 failed: validator.longhorn.io denied the request if having orphan resources ([6246](https://github.com/longhorn/longhorn/issues/6246)) - @derekbit @roger-ryao
- [BUG] Unable to receive support bundle from UI when it's large (400MB+) ([6256](https://github.com/longhorn/longhorn/issues/6256)) - @c3y1huang @chriscchien
- [BUG] Longhorn Manager Pods CrashLoop after upgrade from 1.4.0 to 1.5.0 while backing up volumes ([6264](https://github.com/longhorn/longhorn/issues/6264)) - @ChanYiLin @roger-ryao
- [BUG] Can not delete type=`bi` VolumeSnapshot if related backing image not exist ([6266](https://github.com/longhorn/longhorn/issues/6266)) - @ChanYiLin @chriscchien
- [BUG] 1.5.0: AttachVolume.Attach failed for volume, the volume is currently attached to a different node ([6287](https://github.com/longhorn/longhorn/issues/6287)) - @yangchiu @derekbit
- [BUG] test case test_setting_priority_class failed in master and v1.5.x ([6319](https://github.com/longhorn/longhorn/issues/6319)) - @derekbit @chriscchien
- [BUG] Unused webhook and recovery backend deployment left in helm chart ([6252](https://github.com/longhorn/longhorn/issues/6252)) - @ChanYiLin @chriscchien
## Misc
- [DOC] v1.5.0 additional outgoing firewall ports need to be opened 9501 9502 9503 ([6317](https://github.com/longhorn/longhorn/issues/6317)) - @ChanYiLin @chriscchien
## Contributors
- @ChanYiLin
- @PhanLe1010
- @c3y1huang
- @chriscchien
- @derekbit
- @ejweber
- @innobead
- @roger-ryao
- @yangchiu


@@ -3,5 +3,6 @@ The list of current Longhorn maintainers:
 Name, <Email>, @GitHubHandle
 Sheng Yang, <sheng@yasker.org>, @yasker
 Shuo Wu, <shuo.wu@suse.com>, @shuo-wu
-Joshua Moody, <joshua.moody@suse.com>, @joshimoo
 David Ko, <dko@suse.com>, @innobead
+Derek Su, <derek.su@suse.com>, @derekbit
+Phan Le, <phan.le@suse.com>, @PhanLe1010

README.md

@ -1,8 +1,20 @@
# Longhorn <h1 align="center" style="border-bottom: none">
<a href="https://longhorn.io/" target="_blank"><img alt="Longhorn" width="120px" src="https://github.com/longhorn/website/blob/master/static/img/icon-longhorn.svg"></a><br>Longhorn
</h1>
Longhorn is a distributed block storage system for Kubernetes. Longhorn is cloud native storage because it is built using Kubernetes and container primitives. <p align="center">A CNCF Incubating Project. Visit <a href="https://longhorn.io/" target="_blank">longhorn.io</a> for the full documentation.</p>
Longhorn is lightweight, reliable, and powerful. You can install Longhorn on an existing Kubernetes cluster with one `kubectl apply` command or using Helm charts. Once Longhorn is installed, it adds persistent volume support to the Kubernetes cluster. <div align="center">
[![Releases](https://img.shields.io/github/release/longhorn/longhorn/all.svg)](https://github.com/longhorn/longhorn/releases)
[![GitHub](https://img.shields.io/github/license/longhorn/longhorn)](https://github.com/longhorn/longhorn/blob/master/LICENSE)
[![Docs](https://img.shields.io/badge/docs-latest-green.svg)](https://longhorn.io/docs/latest/)
</div>
Longhorn is a distributed block storage system for Kubernetes. Longhorn is cloud-native storage built using Kubernetes and container primitives.
Longhorn is lightweight, reliable, and powerful. You can install Longhorn on an existing Kubernetes cluster with one `kubectl apply`command or by using Helm charts. Once Longhorn is installed, it adds persistent volume support to the Kubernetes cluster.
Longhorn implements distributed block storage using containers and microservices. Longhorn creates a dedicated storage controller for each block device volume and synchronously replicates the volume across multiple replicas stored on multiple nodes. The storage controller and replicas are themselves orchestrated using Kubernetes. Here are some notable features of Longhorn: Longhorn implements distributed block storage using containers and microservices. Longhorn creates a dedicated storage controller for each block device volume and synchronously replicates the volume across multiple replicas stored on multiple nodes. The storage controller and replicas are themselves orchestrated using Kubernetes. Here are some notable features of Longhorn:
@ -15,40 +27,37 @@ Longhorn implements distributed block storage using containers and microservices
You can read more technical details of Longhorn [here](https://longhorn.io/). You can read more technical details of Longhorn [here](https://longhorn.io/).
## Current Status # Releases
The latest release of Longhorn is [![Releases](https://img.shields.io/github/release/longhorn/longhorn/all.svg)](https://github.com/longhorn/longhorn/releases) > **NOTE**:
> - __\<version\>*__ means the release branch is under active support and will have periodic follow-up patch releases.
> - __Latest__ release means the version is the latest release of the newest release branch.
> - __Stable__ release means the version is stable and has been widely adopted by users.
https://github.com/longhorn/longhorn/releases
| Release | Version | Type | Release Note (Changelog) | Important Note |
|-----------|---------|----------------|----------------------------------------------------------------|-------------------------------------------------------------|
| **1.5*** | 1.5.1 | Latest | [🔗](https://github.com/longhorn/longhorn/releases/tag/v1.5.1) | [🔗](https://longhorn.io/docs/1.5.1/deploy/important-notes) |
| **1.4*** | 1.4.4 | Stable | [🔗](https://github.com/longhorn/longhorn/releases/tag/v1.4.4) | [🔗](https://longhorn.io/docs/1.4.4/deploy/important-notes) |
| 1.3 | 1.3.3 | Stable | [🔗](https://github.com/longhorn/longhorn/releases/tag/v1.3.3) | [🔗](https://longhorn.io/docs/1.3.3/deploy/important-notes) |
| 1.2 | 1.2.6 | Stable | [🔗](https://github.com/longhorn/longhorn/releases/tag/v1.2.6) | [🔗](https://longhorn.io/docs/1.2.6/deploy/important-notes) |
| 1.1 | 1.1.3 | Stable | [🔗](https://github.com/longhorn/longhorn/releases/tag/v1.1.3) | |
# Roadmap
https://github.com/longhorn/longhorn/wiki/Roadmap
# Components
Longhorn is 100% open source software. Project source code is spread across a number of repos:
## Build Status
* Engine: [![Build Status](https://drone-publish.longhorn.io/api/badges/longhorn/longhorn-engine/status.svg)](https://drone-publish.longhorn.io/longhorn/longhorn-engine)[![Go Report Card](https://goreportcard.com/badge/github.com/longhorn/longhorn-engine)](https://goreportcard.com/report/github.com/longhorn/longhorn-engine)[![FOSSA Status](https://app.fossa.com/api/projects/custom%2B25850%2Fgithub.com%2Flonghorn%2Flonghorn-engine.svg?type=shield)](https://app.fossa.com/projects/custom%2B25850%2Fgithub.com%2Flonghorn%2Flonghorn-engine?ref=badge_shield)
* Manager: [![Build Status](https://drone-publish.longhorn.io/api/badges/longhorn/longhorn-manager/status.svg)](https://drone-publish.longhorn.io/longhorn/longhorn-manager)[![Go Report Card](https://goreportcard.com/badge/github.com/longhorn/longhorn-manager)](https://goreportcard.com/report/github.com/longhorn/longhorn-manager)[![FOSSA Status](https://app.fossa.com/api/projects/custom%2B25850%2Fgithub.com%2Flonghorn%2Flonghorn-manager.svg?type=shield)](https://app.fossa.com/projects/custom%2B25850%2Fgithub.com%2Flonghorn%2Flonghorn-manager?ref=badge_shield)
* Instance Manager: [![Build Status](http://drone-publish.longhorn.io/api/badges/longhorn/longhorn-instance-manager/status.svg)](http://drone-publish.longhorn.io/longhorn/longhorn-instance-manager)[![Go Report Card](https://goreportcard.com/badge/github.com/longhorn/longhorn-instance-manager)](https://goreportcard.com/report/github.com/longhorn/longhorn-instance-manager)[![FOSSA Status](https://app.fossa.com/api/projects/custom%2B25850%2Fgithub.com%2Flonghorn%2Flonghorn-instance-manager.svg?type=shield)](https://app.fossa.com/projects/custom%2B25850%2Fgithub.com%2Flonghorn%2Flonghorn-instance-manager?ref=badge_shield)
* Share Manager: [![Build Status](http://drone-publish.longhorn.io/api/badges/longhorn/longhorn-share-manager/status.svg)](http://drone-publish.longhorn.io/longhorn/longhorn-share-manager)[![Go Report Card](https://goreportcard.com/badge/github.com/longhorn/longhorn-share-manager)](https://goreportcard.com/report/github.com/longhorn/longhorn-share-manager)[![FOSSA Status](https://app.fossa.com/api/projects/custom%2B25850%2Fgithub.com%2Flonghorn%2Flonghorn-share-manager.svg?type=shield)](https://app.fossa.com/projects/custom%2B25850%2Fgithub.com%2Flonghorn%2Flonghorn-share-manager?ref=badge_shield)
* Backing Image Manager: [![Build Status](http://drone-publish.longhorn.io/api/badges/longhorn/backing-image-manager/status.svg)](http://drone-publish.longhorn.io/longhorn/backing-image-manager)[![Go Report Card](https://goreportcard.com/badge/github.com/longhorn/backing-image-manager)](https://goreportcard.com/report/github.com/longhorn/backing-image-manager)[![FOSSA Status](https://app.fossa.com/api/projects/custom%2B25850%2Fgithub.com%2Flonghorn%2Fbacking-image-manager.svg?type=shield)](https://app.fossa.com/projects/custom%2B25850%2Fgithub.com%2Flonghorn%2Fbacking-image-manager?ref=badge_shield)
* UI: [![Build Status](https://drone-publish.longhorn.io/api/badges/longhorn/longhorn-ui/status.svg)](https://drone-publish.longhorn.io/longhorn/longhorn-ui)[![FOSSA Status](https://app.fossa.com/api/projects/custom%2B25850%2Fgithub.com%2Flonghorn%2Flonghorn-ui.svg?type=shield)](https://app.fossa.com/projects/custom%2B25850%2Fgithub.com%2Flonghorn%2Flonghorn-ui?ref=badge_shield)
* Test: [![Build Status](http://drone-publish.longhorn.io/api/badges/longhorn/longhorn-tests/status.svg)](http://drone-publish.longhorn.io/longhorn/longhorn-tests)
| Component | What it does | GitHub repo |
| :----------------------------- | :--------------------------------------------------------------------- | :------------------------------------------------------------------------------------------ |
@ -61,18 +70,21 @@ Longhorn is 100% open source software. Project source code is spread across a nu
![Longhorn UI](./longhorn-ui.png)
# Get Started
## Requirements
For the installation requirements, refer to the [Longhorn documentation.](https://longhorn.io/docs/latest/deploy/install/#installation-requirements)
## Installation
> **NOTE**:
> Please note that the master branch is for the upcoming feature release development.
> For an official release installation or upgrade, please refer to the methods below.
Longhorn can be installed on a Kubernetes cluster in several ways (a quick example follows this list):
- [Rancher App Marketplace](https://longhorn.io/docs/latest/deploy/install/install-with-rancher/)
- [kubectl](https://longhorn.io/docs/latest/deploy/install/install-with-kubectl/)
- [Helm](https://longhorn.io/docs/latest/deploy/install/install-with-helm/)
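For a quick sketch of the non-Rancher routes (the release version below is illustrative; pick the release you actually want to deploy):
```
# Install a specific Longhorn release with kubectl (example version)
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.5.1/deploy/longhorn.yaml

# Or install with Helm 3
helm repo add longhorn https://charts.longhorn.io
helm repo update
helm install longhorn longhorn/longhorn --namespace longhorn-system --create-namespace
```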
@ -80,6 +92,24 @@ Longhorn can be installed on a Kubernetes cluster in several ways:
The official Longhorn documentation is [here.](https://longhorn.io/docs)
# Get Involved
## Discussion, Feedback
If you have anything to discuss or any feedback, feel free to [file a discussion](https://github.com/longhorn/longhorn/discussions).
## Feature Requests, Bug Reporting
If you run into any issues, feel free to [file an issue](https://github.com/longhorn/longhorn/issues/new/choose).
We have a weekly community issue review meeting to go through all reported issues and enhancement requests.
When creating a bug issue, please upload the support bundle to the issue or send it to
[longhorn-support-bundle](mailto:longhorn-support-bundle@suse.com).
## Report Vulnerabilities
If you find any vulnerabilities, please report them to [longhorn-security](mailto:longhorn-security@suse.com).
# Community
Longhorn is open source software, so contributions are greatly welcome.
@ -91,25 +121,17 @@ If you have any feedbacks, feel free to [file an issue](https://github.com/longh
If you have any discussions, feedback, requests, issues, or security reports, please use the channels below.
We also have a [CNCF Slack channel: longhorn](https://cloud-native.slack.com/messages/longhorn) for discussion.
## Community Meeting and Office Hours
Hosted by the core maintainers of Longhorn: 4th Friday of every month at 09:00 (CET) or 16:00 (CST) at https://community.cncf.io/longhorn-community/.
## Longhorn Mailing List
Stay up to date on the latest news and events: https://lists.cncf.io/g/cncf-longhorn
You can read more about the community and its events here: https://github.com/longhorn/community
# License
Copyright (c) 2014-2022 The Longhorn Authors
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at


@ -1,8 +1,8 @@
apiVersion: v1
name: longhorn
version: 1.6.0-dev
appVersion: v1.6.0-dev
kubeVersion: ">=1.21.0-0"
description: Longhorn is a distributed block storage system for Kubernetes.
keywords:
- longhorn


@ -18,10 +18,24 @@ Longhorn is 100% open source software. Project source code is spread across a nu
## Prerequisites
1. A container runtime compatible with Kubernetes (Docker v1.13+, containerd v1.3.7+, etc.)
2. Kubernetes >= v1.21
3. Make sure `bash`, `curl`, `findmnt`, `grep`, `awk` and `blkid` have been installed on all nodes of the Kubernetes cluster.
4. Make sure `open-iscsi` has been installed, and the `iscsid` daemon is running on all nodes of the Kubernetes cluster. For GKE, Ubuntu is recommended as the guest OS image since it already contains `open-iscsi`.
## Upgrading to Kubernetes v1.25+
Starting in Kubernetes v1.25, [Pod Security Policies](https://kubernetes.io/docs/concepts/security/pod-security-policy/) have been removed from the Kubernetes API.
As a result, **before upgrading to Kubernetes v1.25** (or on a fresh install in a Kubernetes v1.25+ cluster), users are expected to perform an in-place upgrade of this chart with `enablePSP` set to `false` if it has been previously set to `true`.
> **Note:**
> If you upgrade your cluster to Kubernetes v1.25+ before removing PSPs via a `helm upgrade` (even if you manually clean up resources), **it will leave the Helm release in a broken state within the cluster such that further Helm operations will not work (`helm uninstall`, `helm upgrade`, etc.).**
>
> If your charts get stuck in this state, you may have to clean up your Helm release secrets.
Upon setting `enablePSP` to false, the chart will remove any PSP resources deployed on its behalf from the cluster. This is the default setting for this chart.
As a replacement for PSPs, [Pod Security Admission](https://kubernetes.io/docs/concepts/security/pod-security-admission/) should be used. Please consult the Longhorn docs for more details on how to configure your chart release namespaces to work with the new Pod Security Admission and apply Pod Security Standards.
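As a minimal sketch, assuming Longhorn was previously installed with `enablePSP: true`, the in-place upgrade could look like this (run it before upgrading the cluster to v1.25+):
```
helm upgrade longhorn longhorn/longhorn --namespace longhorn-system --reuse-values --set enablePSP=false
```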
## Installation
1. Add Longhorn chart repository.
```
@ -37,7 +51,7 @@ helm repo update
- With Helm 2, the following command will create the `longhorn-system` namespace and install the Longhorn chart together.
```
helm install longhorn/longhorn --name longhorn --namespace longhorn-system
```
- With Helm 3, the following commands will create the `longhorn-system` namespace first, then install the Longhorn chart.
```
@ -49,14 +63,264 @@ helm install longhorn longhorn/longhorn --namespace longhorn-system
With Helm 2, uninstall Longhorn with:
```
kubectl -n longhorn-system patch -p '{"value": "true"}' --type=merge lhs deleting-confirmation-flag
helm delete longhorn --purge
```
With Helm 3, uninstall Longhorn with:
```
kubectl -n longhorn-system patch -p '{"value": "true"}' --type=merge lhs deleting-confirmation-flag
helm uninstall longhorn -n longhorn-system
kubectl delete namespace longhorn-system
```
## Values
The `values.yaml` contains items used to tweak a deployment of this chart.
### Cattle Settings
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| global.cattle.systemDefaultRegistry | string | `""` | System default registry |
| global.cattle.windowsCluster.defaultSetting.systemManagedComponentsNodeSelector | string | `"kubernetes.io/os:linux"` | Node selector for Longhorn system managed components |
| global.cattle.windowsCluster.defaultSetting.taintToleration | string | `"cattle.io/os=linux:NoSchedule"` | Toleration for Longhorn system managed components |
| global.cattle.windowsCluster.enabled | bool | `false` | Enable this to allow Longhorn to run on the Rancher deployed Windows cluster |
| global.cattle.windowsCluster.nodeSelector | object | `{"kubernetes.io/os":"linux"}` | Select Linux nodes to run Longhorn user deployed components |
| global.cattle.windowsCluster.tolerations | list | `[{"effect":"NoSchedule","key":"cattle.io/os","operator":"Equal","value":"linux"}]` | Tolerate Linux nodes to run Longhorn user deployed components |
### Network Policies
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| networkPolicies.enabled | bool | `false` | Enable NetworkPolicies to limit access to the Longhorn pods |
| networkPolicies.type | string | `"k3s"` | Create the policy based on your distribution to allow access for the ingress. Options: `k3s`, `rke2`, `rke1` |
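For example, a sketch of enabling the bundled NetworkPolicies on an RKE2 cluster at install time (the cluster type is an assumption; use the option matching your distribution):
```
helm install longhorn longhorn/longhorn --namespace longhorn-system \
  --set networkPolicies.enabled=true \
  --set networkPolicies.type=rke2
```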
### Image Settings
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| image.csi.attacher.repository | string | `"longhornio/csi-attacher"` | Specify CSI attacher image repository. Leave blank to autodetect |
| image.csi.attacher.tag | string | `"v4.2.0"` | Specify CSI attacher image tag. Leave blank to autodetect |
| image.csi.livenessProbe.repository | string | `"longhornio/livenessprobe"` | Specify CSI liveness probe image repository. Leave blank to autodetect |
| image.csi.livenessProbe.tag | string | `"v2.9.0"` | Specify CSI liveness probe image tag. Leave blank to autodetect |
| image.csi.nodeDriverRegistrar.repository | string | `"longhornio/csi-node-driver-registrar"` | Specify CSI node driver registrar image repository. Leave blank to autodetect |
| image.csi.nodeDriverRegistrar.tag | string | `"v2.7.0"` | Specify CSI node driver registrar image tag. Leave blank to autodetect |
| image.csi.provisioner.repository | string | `"longhornio/csi-provisioner"` | Specify CSI provisioner image repository. Leave blank to autodetect |
| image.csi.provisioner.tag | string | `"v3.4.1"` | Specify CSI provisioner image tag. Leave blank to autodetect |
| image.csi.resizer.repository | string | `"longhornio/csi-resizer"` | Specify CSI driver resizer image repository. Leave blank to autodetect |
| image.csi.resizer.tag | string | `"v1.7.0"` | Specify CSI driver resizer image tag. Leave blank to autodetect |
| image.csi.snapshotter.repository | string | `"longhornio/csi-snapshotter"` | Specify CSI driver snapshotter image repository. Leave blank to autodetect |
| image.csi.snapshotter.tag | string | `"v6.2.1"` | Specify CSI driver snapshotter image tag. Leave blank to autodetect. |
| image.longhorn.backingImageManager.repository | string | `"longhornio/backing-image-manager"` | Specify Longhorn backing image manager image repository |
| image.longhorn.backingImageManager.tag | string | `"master-head"` | Specify Longhorn backing image manager image tag |
| image.longhorn.engine.repository | string | `"longhornio/longhorn-engine"` | Specify Longhorn engine image repository |
| image.longhorn.engine.tag | string | `"master-head"` | Specify Longhorn engine image tag |
| image.longhorn.instanceManager.repository | string | `"longhornio/longhorn-instance-manager"` | Specify Longhorn instance manager image repository |
| image.longhorn.instanceManager.tag | string | `"master-head"` | Specify Longhorn instance manager image tag |
| image.longhorn.manager.repository | string | `"longhornio/longhorn-manager"` | Specify Longhorn manager image repository |
| image.longhorn.manager.tag | string | `"master-head"` | Specify Longhorn manager image tag |
| image.longhorn.shareManager.repository | string | `"longhornio/longhorn-share-manager"` | Specify Longhorn share manager image repository |
| image.longhorn.shareManager.tag | string | `"master-head"` | Specify Longhorn share manager image tag |
| image.longhorn.supportBundleKit.repository | string | `"longhornio/support-bundle-kit"` | Specify Longhorn support bundle manager image repository |
| image.longhorn.supportBundleKit.tag | string | `"v0.0.27"` | Specify Longhorn support bundle manager image tag |
| image.longhorn.ui.repository | string | `"longhornio/longhorn-ui"` | Specify Longhorn ui image repository |
| image.longhorn.ui.tag | string | `"master-head"` | Specify Longhorn ui image tag |
| image.openshift.oauthProxy.repository | string | `"quay.io/openshift/origin-oauth-proxy"` | For openshift user. Specify oauth proxy image repository |
| image.openshift.oauthProxy.tag | float | `4.13` | For openshift user. Specify oauth proxy image tag. Note: Use your OCP/OKD 4.X Version, Current Stable is 4.13 |
| image.pullPolicy | string | `"IfNotPresent"` | Image pull policy which applies to all user deployed Longhorn Components. e.g, Longhorn manager, Longhorn driver, Longhorn UI |
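As an illustration only, a specific component image can be pinned by overriding its repository and tag (the tag below is an example, not a recommendation):
```
helm install longhorn longhorn/longhorn --namespace longhorn-system \
  --set image.longhorn.manager.repository=longhornio/longhorn-manager \
  --set image.longhorn.manager.tag=v1.5.1 \
  --set image.pullPolicy=IfNotPresent
```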
### Service Settings
| Key | Description |
|-----|-------------|
| service.manager.nodePort | NodePort port number (to set explicitly, choose port between 30000-32767) |
| service.manager.type | Define Longhorn manager service type. |
| service.ui.nodePort | NodePort port number (to set explicitly, choose port between 30000-32767) |
| service.ui.type | Define Longhorn UI service type. Options: `ClusterIP`, `NodePort`, `LoadBalancer`, `Rancher-Proxy` |
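For example, a sketch of exposing the Longhorn UI through a NodePort service (the port number is an arbitrary value in the allowed range):
```
helm install longhorn longhorn/longhorn --namespace longhorn-system \
  --set service.ui.type=NodePort \
  --set service.ui.nodePort=30080
```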
### StorageClass Settings
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| persistence.backingImage.dataSourceParameters | string | `nil` | Specify the data source parameters for the backing image used in Longhorn StorageClass. This option accepts a json string of a map. e.g., `'{\"url\":\"https://backing-image-example.s3-region.amazonaws.com/test-backing-image\"}'`. |
| persistence.backingImage.dataSourceType | string | `nil` | Specify the data source type for the backing image used in Longhorn StorageClass. If the backing image does not exist, Longhorn will use this field to create a backing image. Otherwise, Longhorn will use it to verify the selected backing image. |
| persistence.backingImage.enable | bool | `false` | Set backing image for Longhorn StorageClass |
| persistence.backingImage.expectedChecksum | string | `nil` | Specify the expected SHA512 checksum of the selected backing image in Longhorn StorageClass |
| persistence.backingImage.name | string | `nil` | Specify a backing image that will be used by Longhorn volumes in Longhorn StorageClass. If it does not exist, the backing image data source type and backing image data source parameters should be specified so that Longhorn can create the backing image before using it |
| persistence.defaultClass | bool | `true` | Set Longhorn StorageClass as default |
| persistence.defaultClassReplicaCount | int | `3` | Set replica count for Longhorn StorageClass |
| persistence.defaultDataLocality | string | `"disabled"` | Set data locality for Longhorn StorageClass. Options: `disabled`, `best-effort` |
| persistence.defaultFsType | string | `"ext4"` | Set filesystem type for Longhorn StorageClass |
| persistence.defaultMkfsParams | string | `""` | Set mkfs options for Longhorn StorageClass |
| persistence.defaultNodeSelector.enable | bool | `false` | Enable Node selector for Longhorn StorageClass |
| persistence.defaultNodeSelector.selector | string | `""` | This selector enables only certain nodes having these tags to be used for the volume. e.g. `"storage,fast"` |
| persistence.migratable | bool | `false` | Set volume migratable for Longhorn StorageClass |
| persistence.reclaimPolicy | string | `"Delete"` | Define reclaim policy. Options: `Retain`, `Delete` |
| persistence.recurringJobSelector.enable | bool | `false` | Enable recurring job selector for Longhorn StorageClass |
| persistence.recurringJobSelector.jobList | list | `[]` | Recurring job selector list for Longhorn StorageClass. Please be careful of quotes of input. e.g., `[{"name":"backup", "isGroup":true}]` |
| persistence.removeSnapshotsDuringFilesystemTrim | string | `"ignored"` | Allow automatically removing snapshots during filesystem trim for Longhorn StorageClass. Options: `ignored`, `enabled`, `disabled` |
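As a hedged example, the default StorageClass can be tuned at install time (the values below are illustrative, not recommendations):
```
helm install longhorn longhorn/longhorn --namespace longhorn-system \
  --set persistence.defaultClassReplicaCount=2 \
  --set persistence.defaultDataLocality=best-effort \
  --set persistence.reclaimPolicy=Retain
```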
### CSI Settings
| Key | Description |
|-----|-------------|
| csi.attacherReplicaCount | Specify replica count of CSI Attacher. Leave blank to use default count: 3 |
| csi.kubeletRootDir | Specify kubelet root-dir. Leave blank to autodetect |
| csi.provisionerReplicaCount | Specify replica count of CSI Provisioner. Leave blank to use default count: 3 |
| csi.resizerReplicaCount | Specify replica count of CSI Resizer. Leave blank to use default count: 3 |
| csi.snapshotterReplicaCount | Specify replica count of CSI Snapshotter. Leave blank to use default count: 3 |
### Longhorn Manager Settings
Longhorn system contains user deployed components (e.g, Longhorn manager, Longhorn driver, Longhorn UI) and system managed components (e.g, instance manager, engine image, CSI driver, etc.).
These settings only apply to Longhorn manager component.
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| longhornManager.log.format | string | `"plain"` | Options: `plain`, `json` |
| longhornManager.nodeSelector | object | `{}` | Select nodes to run Longhorn manager |
| longhornManager.priorityClass | string | `nil` | Priority class for longhorn manager |
| longhornManager.serviceAnnotations | object | `{}` | Annotation used in Longhorn manager service |
| longhornManager.tolerations | list | `[]` | Tolerate nodes to run Longhorn manager |
### Longhorn Driver Settings
Longhorn system contains user deployed components (e.g, Longhorn manager, Longhorn driver, Longhorn UI) and system managed components (e.g, instance manager, engine image, CSI driver, etc.).
These settings only apply to Longhorn driver component.
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| longhornDriver.nodeSelector | object | `{}` | Select nodes to run Longhorn driver |
| longhornDriver.priorityClass | string | `nil` | Priority class for longhorn driver |
| longhornDriver.tolerations | list | `[]` | Tolerate nodes to run Longhorn driver |
### Longhorn UI Settings
Longhorn system contains user deployed components (e.g, Longhorn manager, Longhorn driver, Longhorn UI) and system managed components (e.g, instance manager, engine image, CSI driver, etc.).
These settings only apply to Longhorn UI component.
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| longhornUI.nodeSelector | object | `{}` | Select nodes to run Longhorn UI |
| longhornUI.priorityClass | string | `nil` | Priority class count for longhorn ui |
| longhornUI.replicas | int | `2` | Replica count for longhorn ui |
| longhornUI.tolerations | list | `[]` | Tolerate nodes to run Longhorn UI |
### Ingress Settings
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| ingress.annotations | string | `nil` | Ingress annotations done as key:value pairs |
| ingress.enabled | bool | `false` | Set to true to enable ingress record generation |
| ingress.host | string | `"sslip.io"` | Layer 7 Load Balancer hostname |
| ingress.ingressClassName | string | `nil` | Add ingressClassName to the Ingress Can replace the kubernetes.io/ingress.class annotation on v1.18+ |
| ingress.path | string | `"/"` | If ingress is enabled you can set the default ingress path then you can access the UI by using the following full path {{host}}+{{path}} |
| ingress.secrets | string | `nil` | If you're providing your own certificates, please use this to add the certificates as secrets |
| ingress.secureBackends | bool | `false` | Enable this in order to enable that the backend service will be connected at port 443 |
| ingress.tls | bool | `false` | Set this to true in order to enable TLS on the ingress record |
| ingress.tlsSecret | string | `"longhorn.local-tls"` | If TLS is set to true, you must declare what secret will store the key/certificate for TLS |
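For example, a sketch of enabling an ingress for the UI (the hostname and TLS secret name are placeholders):
```
helm install longhorn longhorn/longhorn --namespace longhorn-system \
  --set ingress.enabled=true \
  --set ingress.host=longhorn.example.com \
  --set ingress.tls=true \
  --set ingress.tlsSecret=longhorn-example-tls
```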
### Private Registry Settings
Longhorn can be installed in an air gapped environment with private registry settings. Please refer to **Air Gap Installation** in our official site [link](https://longhorn.io/docs)
| Key | Description |
|-----|-------------|
| privateRegistry.createSecret | Set `true` to create a new private registry secret |
| privateRegistry.registryPasswd | Password used to authenticate to private registry |
| privateRegistry.registrySecret | If create a new private registry secret is true, create a Kubernetes secret with this name; else use the existing secret of this name. Use it to pull images from your private registry |
| privateRegistry.registryUrl | URL of private registry. Leave blank to apply system default registry |
| privateRegistry.registryUser | User used to authenticate to private registry |
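A minimal sketch of the private registry keys in use (the registry URL, credentials, and secret name below are placeholders):
```
helm install longhorn longhorn/longhorn --namespace longhorn-system \
  --set privateRegistry.createSecret=true \
  --set privateRegistry.registryUrl=registry.example.com \
  --set privateRegistry.registryUser=longhorn-puller \
  --set privateRegistry.registryPasswd=example-password \
  --set privateRegistry.registrySecret=longhorn-registry-secret
```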
### OS/Kubernetes Distro Settings
#### OpenShift Settings
Please also refer to this document [ocp-readme](https://github.com/longhorn/longhorn/blob/master/chart/ocp-readme.md) for more details
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| openshift.enabled | bool | `false` | Enable when using openshift |
| openshift.ui.port | int | `443` | UI port in openshift environment |
| openshift.ui.proxy | int | `8443` | UI proxy in openshift environment |
| openshift.ui.route | string | `"longhorn-ui"` | UI route in openshift environment |
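For example, on OCP/OKD these keys might be set explicitly at install time (the values shown simply restate the defaults above):
```
helm install longhorn longhorn/longhorn --namespace longhorn-system \
  --set openshift.enabled=true \
  --set openshift.ui.route=longhorn-ui \
  --set openshift.ui.port=443 \
  --set openshift.ui.proxy=8443
```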
### Other Settings
| Key | Default | Description |
|-----|---------|-------------|
| annotations | `{}` | Annotations to add to the Longhorn Manager DaemonSet Pods. Optional. |
| enablePSP | `false` | For Kubernetes < v1.25, if your cluster enables Pod Security Policy admission controller, set this to `true` to ship longhorn-psp, which allows privileged Longhorn pods to start |
### System Default Settings
For system default settings, you can first leave blank to use default values which will be applied when installing Longhorn.
You can then change them through UI after installation.
For more details like types or options, you can refer to **Settings Reference** in our official site [link](https://longhorn.io/docs)
| Key | Description |
|-----|-------------|
| defaultSettings.allowEmptyDiskSelectorVolume | Allow Scheduling Empty Disk Selector Volumes To Any Disk |
| defaultSettings.allowEmptyNodeSelectorVolume | Allow Scheduling Empty Node Selector Volumes To Any Node |
| defaultSettings.allowRecurringJobWhileVolumeDetached | If this setting is enabled, Longhorn automatically attaches the volume and takes a snapshot/backup when it is time for a recurring snapshot/backup. |
| defaultSettings.allowVolumeCreationWithDegradedAvailability | This setting allows user to create and attach a volume that doesn't have all the replicas scheduled at the time of creation. |
| defaultSettings.autoCleanupSystemGeneratedSnapshot | This setting enables Longhorn to automatically clean up the system generated snapshot after replica rebuild is done. |
| defaultSettings.autoDeletePodWhenVolumeDetachedUnexpectedly | If enabled, Longhorn will automatically delete the workload pod that is managed by a controller (e.g. deployment, statefulset, daemonset, etc...) when Longhorn volume is detached unexpectedly (e.g. during Kubernetes upgrade, Docker reboot, or network disconnect). By deleting the pod, its controller restarts the pod and Kubernetes handles volume reattachment and remount. |
| defaultSettings.autoSalvage | If enabled, volumes will be automatically salvaged when all the replicas become faulty e.g. due to network disconnection. Longhorn will try to figure out which replica(s) are usable, then use them for the volume. By default true. |
| defaultSettings.backingImageCleanupWaitInterval | This interval in minutes determines how long Longhorn will wait before cleaning up the backing image file when there is no replica in the disk using it. |
| defaultSettings.backingImageRecoveryWaitInterval | This interval in seconds determines how long Longhorn will wait before re-downloading the backing image file when all disk files of this backing image become failed or unknown. |
| defaultSettings.backupCompressionMethod | This setting allows users to specify backup compression method. |
| defaultSettings.backupConcurrentLimit | This setting controls how many worker threads run concurrently per backup. |
| defaultSettings.backupTarget | The endpoint used to access the backupstore. Available: NFS, CIFS, AWS, GCP, AZURE. |
| defaultSettings.backupTargetCredentialSecret | The name of the Kubernetes secret associated with the backup target. |
| defaultSettings.backupstorePollInterval | In seconds. The backupstore poll interval determines how often Longhorn checks the backupstore for new backups. Set to 0 to disable the polling. By default 300. |
| defaultSettings.concurrentAutomaticEngineUpgradePerNodeLimit | This setting controls how Longhorn automatically upgrades volumes' engines to the new default engine image after upgrading Longhorn manager. The value of this setting specifies the maximum number of engines per node that are allowed to upgrade to the default engine image at the same time. If the value is 0, Longhorn will not automatically upgrade volumes' engines to default version. |
| defaultSettings.concurrentReplicaRebuildPerNodeLimit | This setting controls how many replicas on a node can be rebuilt simultaneously. |
| defaultSettings.concurrentVolumeBackupRestorePerNodeLimit | This setting controls how many volumes on a node can restore the backup concurrently. Set the value to **0** to disable backup restore. |
| defaultSettings.createDefaultDiskLabeledNodes | Create default Disk automatically only on Nodes with the label "node.longhorn.io/create-default-disk=true" if no other disks exist. If disabled, the default disk will be created on all new nodes when each node is first added. |
| defaultSettings.defaultDataLocality | Longhorn volume has data locality if there is a local replica of the volume on the same node as the pod which is using the volume. |
| defaultSettings.defaultDataPath | Default path to use for storing data on a host. By default "/var/lib/longhorn/" |
| defaultSettings.defaultLonghornStaticStorageClass | The 'storageClassName' is given to PVs and PVCs that are created for an existing Longhorn volume. The StorageClass name can also be used as a label, so it is possible to use a Longhorn StorageClass to bind a workload to an existing PV without creating a Kubernetes StorageClass object. By default 'longhorn-static'. |
| defaultSettings.defaultReplicaCount | The default number of replicas when a volume is created from the Longhorn UI. For Kubernetes configuration, update the `numberOfReplicas` in the StorageClass. By default 3. |
| defaultSettings.deletingConfirmationFlag | This flag is designed to prevent Longhorn from being accidentally uninstalled, which would lead to data loss. |
| defaultSettings.disableRevisionCounter | This setting is only for volumes created by UI. By default, this is false meaning there will be a revision counter file to track every write to the volume. During salvage recovering Longhorn will pick the replica with the largest revision counter as candidate to recover the whole volume. If the revision counter is disabled, Longhorn will not track every write to the volume. During the salvage recovering, Longhorn will use the 'volume-head-xxx.img' file last modification time and file size to pick the replica candidate to recover the whole volume. |
| defaultSettings.disableSchedulingOnCordonedNode | Disable Longhorn manager to schedule replica on Kubernetes cordoned node. By default true. |
| defaultSettings.engineReplicaTimeout | In seconds. The setting specifies the timeout between the engine and replica(s), and the value should be between 8 to 30 seconds. The default value is 8 seconds. |
| defaultSettings.failedBackupTTL | In minutes. This setting determines how long Longhorn will keep the backup resource that was failed. Set to 0 to disable the auto-deletion. |
| defaultSettings.fastReplicaRebuildEnabled | This feature supports the fast replica rebuilding. It relies on the checksum of snapshot disk files, so setting the snapshot-data-integrity to **enable** or **fast-check** is a prerequisite. |
| defaultSettings.guaranteedInstanceManagerCPU | This integer value indicates what percentage of the total allocatable CPU on each node will be reserved for each instance manager Pod. You can leave it with the default value, which is 12%. |
| defaultSettings.kubernetesClusterAutoscalerEnabled | Enabling this setting will notify Longhorn that the cluster is using Kubernetes Cluster Autoscaler. |
| defaultSettings.logLevel | The log level Panic, Fatal, Error, Warn, Info, Debug, Trace used in longhorn manager. Default to Info. |
| defaultSettings.nodeDownPodDeletionPolicy | Defines the Longhorn action when a Volume is stuck with a StatefulSet/Deployment Pod on a node that is down. |
| defaultSettings.nodeDrainPolicy | Define the policy to use when a node with the last healthy replica of a volume is drained. |
| defaultSettings.offlineReplicaRebuilding | This setting allows users to enable the offline replica rebuilding for volumes using v2 data engine. |
| defaultSettings.orphanAutoDeletion | This setting allows Longhorn to automatically delete orphan resources and their corresponding orphaned data, such as stale replicas. Orphan resources on down or unknown nodes will not be cleaned up automatically. |
| defaultSettings.priorityClass | priorityClass for longhorn system components |
| defaultSettings.recurringFailedJobsHistoryLimit | This setting specifies how many failed backup or snapshot job histories should be retained. History will not be retained if the value is 0. |
| defaultSettings.recurringSuccessfulJobsHistoryLimit | This setting specifies how many successful backup or snapshot job histories should be retained. History will not be retained if the value is 0. |
| defaultSettings.removeSnapshotsDuringFilesystemTrim | This setting allows Longhorn filesystem trim feature to automatically mark the latest snapshot and its ancestors as removed and stops at the snapshot containing multiple children. |
| defaultSettings.replicaAutoBalance | Enable this setting to automatically rebalance replicas when an available node is discovered. |
| defaultSettings.replicaDiskSoftAntiAffinity | Allow scheduling on disks with existing healthy replicas of the same volume. By default true. |
| defaultSettings.replicaFileSyncHttpClientTimeout | In seconds. The setting specifies the HTTP client timeout to the file sync server. |
| defaultSettings.replicaReplenishmentWaitInterval | In seconds. The interval determines how long Longhorn will wait at least in order to reuse the existing data on a failed replica rather than directly creating a new replica for a degraded volume. |
| defaultSettings.replicaSoftAntiAffinity | Allow scheduling on nodes with existing healthy replicas of the same volume. By default false. |
| defaultSettings.replicaZoneSoftAntiAffinity | Allow scheduling new Replicas of Volume to the Nodes in the same Zone as existing healthy Replicas. Nodes don't belong to any Zone will be treated as in the same Zone. Notice that Longhorn relies on label `topology.kubernetes.io/zone=<Zone name of the node>` in the Kubernetes node object to identify the zone. By default true. |
| defaultSettings.restoreConcurrentLimit | This setting controls how many worker threads run concurrently per restore. |
| defaultSettings.restoreVolumeRecurringJobs | Restore recurring jobs from the backup volume on the backup target and create recurring jobs if not exist during a backup restoration. |
| defaultSettings.snapshotDataIntegrity | This setting allows users to enable or disable snapshot hashing and data integrity checking. |
| defaultSettings.snapshotDataIntegrityCronjob | Unix-cron string format. The setting specifies when Longhorn checks the data integrity of snapshot disk files. |
| defaultSettings.snapshotDataIntegrityImmediateCheckAfterSnapshotCreation | Hashing snapshot disk files impacts the performance of the system. The immediate snapshot hashing and checking can be disabled to minimize the impact after creating a snapshot. |
| defaultSettings.storageMinimalAvailablePercentage | If the minimum available disk capacity exceeds the actual percentage of available disk capacity, the disk becomes unschedulable until more space is freed up. By default 25. |
| defaultSettings.storageNetwork | Longhorn uses the storage network for in-cluster data traffic. Leave this blank to use the Kubernetes cluster network. |
| defaultSettings.storageOverProvisioningPercentage | The over-provisioning percentage defines how much storage can be allocated relative to the hard drive's capacity. By default 200. |
| defaultSettings.storageReservedPercentageForDefaultDisk | The reserved percentage specifies the percentage of disk space that will not be allocated to the default disk on each new Longhorn node. |
| defaultSettings.supportBundleFailedHistoryLimit | This setting specifies how many failed support bundles can exist in the cluster. Set this value to **0** to have Longhorn automatically purge all failed support bundles. |
| defaultSettings.systemManagedComponentsNodeSelector | nodeSelector for longhorn system components |
| defaultSettings.systemManagedPodsImagePullPolicy | This setting defines the Image Pull Policy of Longhorn system managed pod. e.g. instance manager, engine image, CSI driver, etc. The new Image Pull Policy will only apply after the system managed pods restart. |
| defaultSettings.taintToleration | taintToleration for longhorn system components |
| defaultSettings.upgradeChecker | Upgrade Checker will check for new Longhorn version periodically. When there is a new version available, a notification will appear in the UI. By default true. |
| defaultSettings.v2DataEngine | This allows users to activate v2 data engine based on SPDK. Currently, it is in the preview phase and should not be utilized in a production environment. |
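As a hedged illustration, a few of these defaults could be seeded at install time (the backup target and replica count below are example values only):
```
helm install longhorn longhorn/longhorn --namespace longhorn-system \
  --set defaultSettings.backupTarget=nfs://backup.example.com:/var/nfs-share \
  --set defaultSettings.defaultReplicaCount=2 \
  --set defaultSettings.logLevel=Info
```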
---
Please see [link](https://github.com/longhorn/longhorn) for more information.

chart/README.md.gotmpl

@ -0,0 +1,253 @@
# Longhorn Chart
> **Important**: Please install the Longhorn chart in the `longhorn-system` namespace only.
> **Warning**: Longhorn doesn't support downgrading from a higher version to a lower version.
## Source Code
Longhorn is 100% open source software. Project source code is spread across a number of repos:
1. Longhorn Engine -- Core controller/replica logic https://github.com/longhorn/longhorn-engine
2. Longhorn Instance Manager -- Controller/replica instance lifecycle management https://github.com/longhorn/longhorn-instance-manager
3. Longhorn Share Manager -- NFS provisioner that exposes Longhorn volumes as ReadWriteMany volumes. https://github.com/longhorn/longhorn-share-manager
4. Backing Image Manager -- Backing image file lifecycle management. https://github.com/longhorn/backing-image-manager
5. Longhorn Manager -- Longhorn orchestration, includes CSI driver for Kubernetes https://github.com/longhorn/longhorn-manager
6. Longhorn UI -- Dashboard https://github.com/longhorn/longhorn-ui
## Prerequisites
1. A container runtime compatible with Kubernetes (Docker v1.13+, containerd v1.3.7+, etc.)
2. Kubernetes >= v1.21
3. Make sure `bash`, `curl`, `findmnt`, `grep`, `awk` and `blkid` have been installed on all nodes of the Kubernetes cluster.
4. Make sure `open-iscsi` has been installed, and the `iscsid` daemon is running on all nodes of the Kubernetes cluster. For GKE, Ubuntu is recommended as the guest OS image since it already contains `open-iscsi`.
## Upgrading to Kubernetes v1.25+
Starting in Kubernetes v1.25, [Pod Security Policies](https://kubernetes.io/docs/concepts/security/pod-security-policy/) have been removed from the Kubernetes API.
As a result, **before upgrading to Kubernetes v1.25** (or on a fresh install in a Kubernetes v1.25+ cluster), users are expected to perform an in-place upgrade of this chart with `enablePSP` set to `false` if it has been previously set to `true`.
> **Note:**
> If you upgrade your cluster to Kubernetes v1.25+ before removing PSPs via a `helm upgrade` (even if you manually clean up resources), **it will leave the Helm release in a broken state within the cluster such that further Helm operations will not work (`helm uninstall`, `helm upgrade`, etc.).**
>
> If your charts get stuck in this state, you may have to clean up your Helm release secrets.
Upon setting `enablePSP` to false, the chart will remove any PSP resources deployed on its behalf from the cluster. This is the default setting for this chart.
As a replacement for PSPs, [Pod Security Admission](https://kubernetes.io/docs/concepts/security/pod-security-admission/) should be used. Please consult the Longhorn docs for more details on how to configure your chart release namespaces to work with the new Pod Security Admission and apply Pod Security Standards.
## Installation
1. Add Longhorn chart repository.
```
helm repo add longhorn https://charts.longhorn.io
```
2. Update local Longhorn chart information from chart repository.
```
helm repo update
```
3. Install Longhorn chart.
- With Helm 2, the following command will create the `longhorn-system` namespace and install the Longhorn chart together.
```
helm install longhorn/longhorn --name longhorn --namespace longhorn-system
```
- With Helm 3, the following commands will create the `longhorn-system` namespace first, then install the Longhorn chart.
```
kubectl create namespace longhorn-system
helm install longhorn longhorn/longhorn --namespace longhorn-system
```
## Uninstallation
With Helm 2, uninstall Longhorn with:
```
kubectl -n longhorn-system patch -p '{"value": "true"}' --type=merge lhs deleting-confirmation-flag
helm delete longhorn --purge
```
With Helm 3, uninstall Longhorn with:
```
kubectl -n longhorn-system patch -p '{"value": "true"}' --type=merge lhs deleting-confirmation-flag
helm uninstall longhorn -n longhorn-system
kubectl delete namespace longhorn-system
```
## Values
The `values.yaml` contains items used to tweak a deployment of this chart.
### Cattle Settings
| Key | Type | Default | Description |
|-----|------|---------|-------------|
{{- range .Values }}
{{- if hasPrefix "global" .Key }}
| {{ .Key }} | {{ .Type }} | {{ if .Default }}{{ .Default }}{{ else }}{{ .AutoDefault }}{{ end }} | {{ if .Description }}{{ .Description }}{{ else }}{{ .AutoDescription }}{{ end }} |
{{- end }}
{{- end }}
### Network Policies
| Key | Type | Default | Description |
|-----|------|---------|-------------|
{{- range .Values }}
{{- if hasPrefix "networkPolicies" .Key }}
| {{ .Key }} | {{ .Type }} | {{ if .Default }}{{ .Default }}{{ else }}{{ .AutoDefault }}{{ end }} | {{ if .Description }}{{ .Description }}{{ else }}{{ .AutoDescription }}{{ end }} |
{{- end }}
{{- end }}
### Image Settings
| Key | Type | Default | Description |
|-----|------|---------|-------------|
{{- range .Values }}
{{- if hasPrefix "image" .Key }}
| {{ .Key }} | {{ .Type }} | {{ if .Default }}{{ .Default }}{{ else }}{{ .AutoDefault }}{{ end }} | {{ if .Description }}{{ .Description }}{{ else }}{{ .AutoDescription }}{{ end }} |
{{- end }}
{{- end }}
### Service Settings
| Key | Description |
|-----|-------------|
{{- range .Values }}
{{- if (and (hasPrefix "service" .Key) (not (contains "Account" .Key))) }}
| {{ .Key }} | {{ if .Description }}{{ .Description }}{{ else }}{{ .AutoDescription }}{{ end }} |
{{- end }}
{{- end }}
### StorageClass Settings
| Key | Type | Default | Description |
|-----|------|---------|-------------|
{{- range .Values }}
{{- if hasPrefix "persistence" .Key }}
| {{ .Key }} | {{ .Type }} | {{ if .Default }}{{ .Default }}{{ else }}{{ .AutoDefault }}{{ end }} | {{ if .Description }}{{ .Description }}{{ else }}{{ .AutoDescription }}{{ end }} |
{{- end }}
{{- end }}
### CSI Settings
| Key | Description |
|-----|-------------|
{{- range .Values }}
{{- if hasPrefix "csi" .Key }}
| {{ .Key }} | {{ if .Description }}{{ .Description }}{{ else }}{{ .AutoDescription }}{{ end }} |
{{- end }}
{{- end }}
### Longhorn Manager Settings
Longhorn system contains user deployed components (e.g, Longhorn manager, Longhorn driver, Longhorn UI) and system managed components (e.g, instance manager, engine image, CSI driver, etc.).
These settings only apply to Longhorn manager component.
| Key | Type | Default | Description |
|-----|------|---------|-------------|
{{- range .Values }}
{{- if hasPrefix "longhornManager" .Key }}
| {{ .Key }} | {{ .Type }} | {{ if .Default }}{{ .Default }}{{ else }}{{ .AutoDefault }}{{ end }} | {{ if .Description }}{{ .Description }}{{ else }}{{ .AutoDescription }}{{ end }} |
{{- end }}
{{- end }}
### Longhorn Driver Settings
Longhorn system contains user deployed components (e.g, Longhorn manager, Longhorn driver, Longhorn UI) and system managed components (e.g, instance manager, engine image, CSI driver, etc.).
These settings only apply to Longhorn driver component.
| Key | Type | Default | Description |
|-----|------|---------|-------------|
{{- range .Values }}
{{- if hasPrefix "longhornDriver" .Key }}
| {{ .Key }} | {{ .Type }} | {{ if .Default }}{{ .Default }}{{ else }}{{ .AutoDefault }}{{ end }} | {{ if .Description }}{{ .Description }}{{ else }}{{ .AutoDescription }}{{ end }} |
{{- end }}
{{- end }}
### Longhorn UI Settings
Longhorn system contains user deployed components (e.g, Longhorn manager, Longhorn driver, Longhorn UI) and system managed components (e.g, instance manager, engine image, CSI driver, etc.).
These settings only apply to Longhorn UI component.
| Key | Type | Default | Description |
|-----|------|---------|-------------|
{{- range .Values }}
{{- if hasPrefix "longhornUI" .Key }}
| {{ .Key }} | {{ .Type }} | {{ if .Default }}{{ .Default }}{{ else }}{{ .AutoDefault }}{{ end }} | {{ if .Description }}{{ .Description }}{{ else }}{{ .AutoDescription }}{{ end }} |
{{- end }}
{{- end }}
### Ingress Settings
| Key | Type | Default | Description |
|-----|------|---------|-------------|
{{- range .Values }}
{{- if hasPrefix "ingress" .Key }}
| {{ .Key }} | {{ .Type }} | {{ if .Default }}{{ .Default }}{{ else }}{{ .AutoDefault }}{{ end }} | {{ if .Description }}{{ .Description }}{{ else }}{{ .AutoDescription }}{{ end }} |
{{- end }}
{{- end }}
### Private Registry Settings
Longhorn can be installed in an air gapped environment with private registry settings. Please refer to **Air Gap Installation** in our official site [link](https://longhorn.io/docs)
| Key | Description |
|-----|-------------|
{{- range .Values }}
{{- if hasPrefix "privateRegistry" .Key }}
| {{ .Key }} | {{ if .Description }}{{ .Description }}{{ else }}{{ .AutoDescription }}{{ end }} |
{{- end }}
{{- end }}
### OS/Kubernetes Distro Settings
#### OpenShift Settings
Please also refer to this document [ocp-readme](https://github.com/longhorn/longhorn/blob/master/chart/ocp-readme.md) for more details
| Key | Type | Default | Description |
|-----|------|---------|-------------|
{{- range .Values }}
{{- if hasPrefix "openshift" .Key }}
| {{ .Key }} | {{ .Type }} | {{ if .Default }}{{ .Default }}{{ else }}{{ .AutoDefault }}{{ end }} | {{ if .Description }}{{ .Description }}{{ else }}{{ .AutoDescription }}{{ end }} |
{{- end }}
{{- end }}
### Other Settings
| Key | Default | Description |
|-----|---------|-------------|
{{- range .Values }}
{{- if not (or (hasPrefix "defaultSettings" .Key)
(hasPrefix "networkPolicies" .Key)
(hasPrefix "image" .Key)
(hasPrefix "service" .Key)
(hasPrefix "persistence" .Key)
(hasPrefix "csi" .Key)
(hasPrefix "longhornManager" .Key)
(hasPrefix "longhornDriver" .Key)
(hasPrefix "longhornUI" .Key)
(hasPrefix "privateRegistry" .Key)
(hasPrefix "ingress" .Key)
(hasPrefix "openshift" .Key)
(hasPrefix "global" .Key)) }}
| {{ .Key }} | {{ if .Default }}{{ .Default }}{{ else }}{{ .AutoDefault }}{{ end }} | {{ if .Description }}{{ .Description }}{{ else }}{{ .AutoDescription }}{{ end }} |
{{- end }}
{{- end }}
### System Default Settings
For system default settings, you can first leave blank to use default values which will be applied when installing Longhorn.
You can then change them through UI after installation.
For more details like types or options, you can refer to **Settings Reference** in our official site [link](https://longhorn.io/docs)
| Key | Description |
|-----|-------------|
{{- range .Values }}
{{- if hasPrefix "defaultSettings" .Key }}
| {{ .Key }} | {{ if .Description }}{{ .Description }}{{ else }}{{ .AutoDescription }}{{ end }} |
{{- end }}
{{- end }}
---
Please see [link](https://github.com/longhorn/longhorn) for more information.

chart/ocp-readme.md

@ -0,0 +1,177 @@
# OpenShift / OKD Extra Configuration Steps
- [OpenShift / OKD Extra Configuration Steps](#openshift--okd-extra-configuration-steps)
- [Notes](#notes)
- [Known Issues](#known-issues)
- [Preparing Nodes (Optional)](#preparing-nodes-optional)
- [Default /var/lib/longhorn setup](#default-varliblonghorn-setup)
- [Separate /var/mnt/longhorn setup](#separate-varmntlonghorn-setup)
- [Create Filesystem](#create-filesystem)
- [Mounting Disk On Boot](#mounting-disk-on-boot)
- [Label and Annotate Nodes](#label-and-annotate-nodes)
- [Example values.yaml](#example-valuesyaml)
- [Installation](#installation)
- [Refs](#refs)
## Notes
Main changes and tasks for OCP are:
- On OCP / OKD, the Operating System is Managed by the Cluster
- OCP Imposes [Security Context Constraints](https://docs.openshift.com/container-platform/4.11/authentication/managing-security-context-constraints.html)
- This requires everything to run with the least privilege possible. For the moment, every component has been given access to run with higher privileges.
- Something to circle back on is network policies and which components can have their privileges reduced without impacting functionality.
- The UI, for example, probably can be.
- openshift/oauth-proxy for authentication to the Longhorn UI
- **⚠️** Currently Scoped to Authenticated Users that can delete a longhorn settings object.
- **⚠️** Since the UI itself is not protected, network policies will need to be created to prevent namespace <--> namespace communication against the pod or service object directly.
- Anyone with access to the UI Deployment can remove the route restriction. (Namespace Scoped Admin)
- Option to use separate disk in /var/mnt/longhorn & MachineConfig file to mount /var/mnt/longhorn
- Adding finalizers for mount propagation
## Known Issues
- General Feature/Issue Thread
  - [[FEATURE] Deploying Longhorn on OKD/Openshift](https://github.com/longhorn/longhorn/issues/1831)
- 4.10 / 1.23:
  - 4.10.0-0.okd-2022-03-07-131213 to 4.10.0-0.okd-2022-07-09-073606
    - Tested, No Known Issues
- 4.11 / 1.24:
  - 4.11.0-0.okd-2022-07-27-052000 to 4.11.0-0.okd-2022-11-19-050030
    - Tested, No Known Issues
  - 4.11.0-0.okd-2022-12-02-145640, 4.11.0-0.okd-2023-01-14-152430:
    - Workaround: [[BUG] Volumes Stuck in Attach/Detach Loop](https://github.com/longhorn/longhorn/issues/4988)
    - [MachineConfig Patch](https://github.com/longhorn/longhorn/issues/4988#issuecomment-1345676772)
- 4.12 / 1.25:
  - 4.12.0-0.okd-2022-12-05-210624 to 4.12.0-0.okd-2023-01-20-101927
    - Tested, No Known Issues
  - 4.12.0-0.okd-2023-01-21-055900 to 4.12.0-0.okd-2023-02-18-033438:
    - Workaround: [[BUG] Volumes Stuck in Attach/Detach Loop](https://github.com/longhorn/longhorn/issues/4988)
    - [MachineConfig Patch](https://github.com/longhorn/longhorn/issues/4988#issuecomment-1345676772)
  - 4.12.0-0.okd-2023-03-05-022504 - 4.12.0-0.okd-2023-04-16-041331:
    - Tested, No Known Issues
- 4.13 / 1.26:
  - 4.13.0-0.okd-2023-05-03-001308 - 4.13.0-0.okd-2023-08-18-135805:
    - Tested, No Known Issues
- 4.14 / 1.27:
  - 4.14.0-0.okd-2023-08-12-022330 - 4.14.0-0.okd-2023-10-28-073550:
    - Tested, No Known Issues
## Preparing Nodes (Optional)
This is only required if you need additional customizations, such as storage-less nodes or secondary disks.
### Default /var/lib/longhorn setup
Label each node for storage with:
```bash
oc get nodes --no-headers | awk '{print $1}'
export NODE="worker-0"
oc label node "${NODE}" node.longhorn.io/create-default-disk=true
```
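To confirm which nodes carry the label, a quick check such as the following can be used (adjust the value if you use `config` as in the separate-disk setup below):

```bash
# List the nodes on which Longhorn will create a default disk
oc get nodes -l node.longhorn.io/create-default-disk=true
```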
### Separate /var/mnt/longhorn setup
#### Create Filesystem
On each storage node, create a filesystem with the label `longhorn`:
```bash
oc get nodes --no-headers | awk '{print $1}'
export NODE="worker-0"
oc debug node/${NODE} -t -- chroot /host bash
# Validate Target Drive is Present
lsblk
export DRIVE="sdb" #vdb
sudo mkfs.ext4 -L longhorn /dev/${DRIVE}
```
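Before exiting the debug shell, it helps to confirm the label was written to the new filesystem, since the MachineConfig below mounts the disk by that label:

```bash
# Still inside the chroot on the node
lsblk -f /dev/${DRIVE}                 # LABEL column should show "longhorn"
blkid /dev/disk/by-label/longhorn      # resolves once the label exists
```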
> ⚠️ Note: If you add new nodes after the MachineConfig below is applied, you will also need to reboot those nodes.
#### Mounting Disk On Boot
The secondary drive needs to be mounted on every boot. Save the contents below and apply the MachineConfig with `oc apply -f`:
> ⚠️ This will trigger a machine config profile update and reboot all worker nodes in the cluster.
```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 71-mount-storage-worker
spec:
  config:
    ignition:
      version: 3.2.0
    systemd:
      units:
        - name: var-mnt-longhorn.mount
          enabled: true
          contents: |
            [Unit]
            Before=local-fs.target
            [Mount]
            Where=/var/mnt/longhorn
            What=/dev/disk/by-label/longhorn
            Options=rw,relatime,discard
            [Install]
            WantedBy=local-fs.target
```
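After the apply, the worker MachineConfigPool rolls the change out node by node. A minimal sketch of how to watch the rollout and confirm the mount afterwards:

```bash
# Watch the worker pool until UPDATED becomes True (nodes reboot during the rollout)
oc get machineconfigpool worker -w
# Once the node is back, confirm the systemd unit mounted the disk
oc debug node/${NODE} -t -- chroot /host findmnt /var/mnt/longhorn
```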
#### Label and Annotate Nodes
Label and annotate storage nodes like this:
```bash
oc get nodes --no-headers | awk '{print $1}'
export NODE="worker-0"
oc annotate node ${NODE} --overwrite node.longhorn.io/default-disks-config='[{"path":"/var/mnt/longhorn","allowScheduling":true}]'
oc label node ${NODE} node.longhorn.io/create-default-disk=config
```
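A quick way to double-check that both the label and the annotation are present before installing (the exact output format is not important):

```bash
# Labels and annotations containing "longhorn" should include
# create-default-disk=config and the default-disks-config annotation
oc describe node "${NODE}" | grep -i longhorn
```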
## Example values.yaml
Minimum Adjustments Required
```yaml
openshift:
  oauthProxy:
    repository: quay.io/openshift/origin-oauth-proxy
    tag: 4.14  # Use Your OCP/OKD 4.X Version, Current Stable is 4.14

# defaultSettings: # Preparing nodes (Optional)
#   createDefaultDiskLabeledNodes: true

openshift:
  enabled: true
  ui:
    route: "longhorn-ui"
    port: 443
    proxy: 8443
```
## Installation
```bash
# helm template ./chart/ --namespace longhorn-system --values ./chart/values.yaml --no-hooks > longhorn.yaml # Local Testing
helm template longhorn --namespace longhorn-system --values values.yaml --no-hooks > longhorn.yaml
oc create namespace longhorn-system -o yaml --dry-run=client | oc apply -f -
oc apply -f longhorn.yaml -n longhorn-system
```
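After the apply, a minimal sanity check (names assume the `longhorn-system` namespace and the `longhorn-ui` route value from the example above):

```bash
oc -n longhorn-system get pods
oc -n longhorn-system get route longhorn-ui
```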
## Refs
- <https://docs.openshift.com/container-platform/4.11/storage/persistent_storage/persistent-storage-iscsi.html>
- <https://docs.okd.io/4.11/storage/persistent_storage/persistent-storage-iscsi.html>
- okd 4.5: <https://github.com/longhorn/longhorn/issues/1831#issuecomment-702690613>
- okd 4.6: <https://github.com/longhorn/longhorn/issues/1831#issuecomment-765884631>
- oauth-proxy: <https://github.com/openshift/oauth-proxy/blob/master/contrib/sidecar.yaml>
- <https://github.com/longhorn/longhorn/issues/1831>


@@ -17,7 +17,7 @@ questions:
   label: Longhorn Manager Image Repository
   group: "Longhorn Images Settings"
 - variable: image.longhorn.manager.tag
-  default: v1.3.3-rc3
+  default: master-head
   description: "Specify Longhorn Manager Image Tag"
   type: string
   label: Longhorn Manager Image Tag
@@ -29,7 +29,7 @@ questions:
   label: Longhorn Engine Image Repository
   group: "Longhorn Images Settings"
 - variable: image.longhorn.engine.tag
-  default: v1.3.3-rc3
+  default: master-head
   description: "Specify Longhorn Engine Image Tag"
   type: string
   label: Longhorn Engine Image Tag
@@ -41,7 +41,7 @@ questions:
   label: Longhorn UI Image Repository
   group: "Longhorn Images Settings"
 - variable: image.longhorn.ui.tag
-  default: v1.3.3-rc3
+  default: master-head
   description: "Specify Longhorn UI Image Tag"
   type: string
   label: Longhorn UI Image Tag
@@ -53,7 +53,7 @@ questions:
   label: Longhorn Instance Manager Image Repository
   group: "Longhorn Images Settings"
 - variable: image.longhorn.instanceManager.tag
-  default: v1_20230407
+  default: v2_20221123
   description: "Specify Longhorn Instance Manager Image Tag"
   type: string
   label: Longhorn Instance Manager Image Tag
@@ -65,7 +65,7 @@ questions:
   label: Longhorn Share Manager Image Repository
   group: "Longhorn Images Settings"
 - variable: image.longhorn.shareManager.tag
-  default: v1_20230320
+  default: v1_20220914
   description: "Specify Longhorn Share Manager Image Tag"
   type: string
   label: Longhorn Share Manager Image Tag
@@ -77,11 +77,23 @@ questions:
   label: Longhorn Backing Image Manager Image Repository
   group: "Longhorn Images Settings"
 - variable: image.longhorn.backingImageManager.tag
-  default: v3_20230320
+  default: v3_20220808
   description: "Specify Longhorn Backing Image Manager Image Tag"
   type: string
   label: Longhorn Backing Image Manager Image Tag
   group: "Longhorn Images Settings"
+- variable: image.longhorn.supportBundleKit.repository
+  default: longhornio/support-bundle-kit
+  description: "Specify Longhorn Support Bundle Manager Image Repository"
+  type: string
+  label: Longhorn Support Bundle Kit Image Repository
+  group: "Longhorn Images Settings"
+- variable: image.longhorn.supportBundleKit.tag
+  default: v0.0.27
+  description: "Specify Longhorn Support Bundle Manager Image Tag"
+  type: string
+  label: Longhorn Support Bundle Kit Image Tag
+  group: "Longhorn Images Settings"
 - variable: image.csi.attacher.repository
   default: longhornio/csi-attacher
   description: "Specify CSI attacher image repository. Leave blank to autodetect."
@@ -89,7 +101,7 @@ questions:
   label: Longhorn CSI Attacher Image Repository
   group: "Longhorn CSI Driver Images"
 - variable: image.csi.attacher.tag
-  default: v3.4.0
+  default: v4.2.0
   description: "Specify CSI attacher image tag. Leave blank to autodetect."
   type: string
   label: Longhorn CSI Attacher Image Tag
@@ -101,7 +113,7 @@ questions:
   label: Longhorn CSI Provisioner Image Repository
   group: "Longhorn CSI Driver Images"
 - variable: image.csi.provisioner.tag
-  default: v2.1.2
+  default: v3.4.1
   description: "Specify CSI provisioner image tag. Leave blank to autodetect."
   type: string
   label: Longhorn CSI Provisioner Image Tag
@@ -113,7 +125,7 @@ questions:
   label: Longhorn CSI Node Driver Registrar Image Repository
   group: "Longhorn CSI Driver Images"
 - variable: image.csi.nodeDriverRegistrar.tag
-  default: v2.5.0
+  default: v2.7.0
   description: "Specify CSI Node Driver Registrar image tag. Leave blank to autodetect."
   type: string
   label: Longhorn CSI Node Driver Registrar Image Tag
@@ -125,7 +137,7 @@ questions:
   label: Longhorn CSI Driver Resizer Image Repository
   group: "Longhorn CSI Driver Images"
 - variable: image.csi.resizer.tag
-  default: v1.2.0
+  default: v1.7.0
   description: "Specify CSI Driver Resizer image tag. Leave blank to autodetect."
   type: string
   label: Longhorn CSI Driver Resizer Image Tag
@@ -137,7 +149,7 @@ questions:
   label: Longhorn CSI Driver Snapshotter Image Repository
   group: "Longhorn CSI Driver Images"
 - variable: image.csi.snapshotter.tag
-  default: v3.0.3
+  default: v6.2.1
   description: "Specify CSI Driver Snapshotter image tag. Leave blank to autodetect."
   type: string
   label: Longhorn CSI Driver Snapshotter Image Tag
@@ -147,9 +159,9 @@ questions:
   description: "Specify CSI liveness probe image repository. Leave blank to autodetect."
   type: string
   label: Longhorn CSI Liveness Probe Image Repository
-  group: "Longhorn CSI Liveness Probe Images"
+  group: "Longhorn CSI Driver Images"
 - variable: image.csi.livenessProbe.tag
-  default: v2.8.0
+  default: v2.9.0
   description: "Specify CSI liveness probe image tag. Leave blank to autodetect."
   type: string
   label: Longhorn CSI Liveness Probe Image Tag
@@ -232,7 +244,7 @@ questions:
   group: "Longhorn CSI Driver Settings"
 - variable: defaultSettings.backupTarget
   label: Backup Target
-  description: "The endpoint used to access the backupstore. NFS and S3 are supported."
+  description: "The endpoint used to access the backupstore. Available: NFS, CIFS, AWS, GCP, AZURE"
   group: "Longhorn Default Settings"
   type: string
   default:
@@ -244,8 +256,7 @@ questions:
   default:
 - variable: defaultSettings.allowRecurringJobWhileVolumeDetached
   label: Allow Recurring Job While Volume Is Detached
-  description: 'If this setting is enabled, Longhorn will automatically attaches the volume and takes snapshot/backup when it is the time to do recurring snapshot/backup.
-Note that the volume is not ready for workload during the period when the volume was automatically attached. Workload will have to wait until the recurring job finishes.'
+  description: 'If this setting is enabled, Longhorn will automatically attaches the volume and takes snapshot/backup when it is the time to do recurring snapshot/backup.'
   group: "Longhorn Default Settings"
   type: boolean
   default: "false"
@@ -263,11 +274,7 @@ Note that the volume is not ready for workload during the period when the volume
   default: "/var/lib/longhorn/"
 - variable: defaultSettings.defaultDataLocality
   label: Default Data Locality
-  description: 'We say a Longhorn volume has data locality if there is a local replica of the volume on the same node as the pod which is using the volume.
-This setting specifies the default data locality when a volume is created from the Longhorn UI. For Kubernetes configuration, update the `dataLocality` in the StorageClass
-The available modes are:
-- **disabled**. This is the default option. There may or may not be a replica on the same node as the attached volume (workload)
-- **best-effort**. This option instructs Longhorn to try to keep a replica on the same node as the attached volume (workload). Longhorn will not stop the volume, even if it cannot keep a replica local to the attached volume (workload) due to environment limitation, e.g. not enough disk space, incompatible disk tags, etc.'
+  description: 'Longhorn volume has data locality if there is a local replica of the volume on the same node as the pod which is using the volume.'
   group: "Longhorn Default Settings"
   type: enum
   options:
@@ -282,17 +289,7 @@ The available modes are:
   default: "false"
 - variable: defaultSettings.replicaAutoBalance
   label: Replica Auto Balance
-  description: 'Enable this setting automatically rebalances replicas when discovered an available node.
-The available global options are:
-- **disabled**. This is the default option. No replica auto-balance will be done.
-- **least-effort**. This option instructs Longhorn to balance replicas for minimal redundancy.
-- **best-effort**. This option instructs Longhorn to balance replicas for even redundancy.
-Longhorn also support individual volume setting. The setting can be specified in volume.spec.replicaAutoBalance, this overrules the global setting.
-The available volume spec options are:
-- **ignored**. This is the default option that instructs Longhorn to inherit from the global setting.
-- **disabled**. This option instructs Longhorn no replica auto-balance should be done.
-- **least-effort**. This option instructs Longhorn to balance replicas for minimal redundancy.
-- **best-effort**. This option instructs Longhorn to balance replicas for even redundancy.'
+  description: 'Enable this setting automatically rebalances replicas when discovered an available node.'
   group: "Longhorn Default Settings"
   type: enum
   options:
@@ -315,6 +312,14 @@ The available volume spec options are:
   min: 0
   max: 100
   default: 25
+- variable: defaultSettings.storageReservedPercentageForDefaultDisk
+  label: Storage Reserved Percentage For Default Disk
+  description: "The reserved percentage specifies the percentage of disk space that will not be allocated to the default disk on each new Longhorn node."
+  group: "Longhorn Default Settings"
+  type: int
+  min: 0
+  max: 100
+  default: 30
 - variable: defaultSettings.upgradeChecker
   label: Enable Upgrade Checker
   description: 'Upgrade Checker will check for new Longhorn version periodically. When there is a new version available, a notification will appear in the UI. By default true.'
@@ -344,14 +349,38 @@ The available volume spec options are:
   default: 300
 - variable: defaultSettings.failedBackupTTL
   label: Failed Backup Time to Live
-  description: "In minutes. This setting determines how long Longhorn will keep the backup resource that was failed. Set to 0 to disable the auto-deletion.
-Failed backups will be checked and cleaned up during backupstore polling which is controlled by **Backupstore Poll Interval** setting.
-Hence this value determines the minimal wait interval of the cleanup. And the actual cleanup interval is multiple of **Backupstore Poll Interval**.
-Disabling **Backupstore Poll Interval** also means to disable failed backup auto-deletion."
+  description: "In minutes. This setting determines how long Longhorn will keep the backup resource that was failed. Set to 0 to disable the auto-deletion."
   group: "Longhorn Default Settings"
   type: int
   min: 0
   default: 1440
+- variable: defaultSettings.restoreVolumeRecurringJobs
+  label: Restore Volume Recurring Jobs
+  description: "Restore recurring jobs from the backup volume on the backup target and create recurring jobs if not exist during a backup restoration."
+  group: "Longhorn Default Settings"
+  type: boolean
+  default: "false"
+- variable: defaultSettings.recurringSuccessfulJobsHistoryLimit
+  label: Cronjob Successful Jobs History Limit
+  description: "This setting specifies how many successful backup or snapshot job histories should be retained. History will not be retained if the value is 0."
+  group: "Longhorn Default Settings"
+  type: int
+  min: 0
+  default: 1
+- variable: defaultSettings.recurringFailedJobsHistoryLimit
+  label: Cronjob Failed Jobs History Limit
+  description: "This setting specifies how many failed backup or snapshot job histories should be retained. History will not be retained if the value is 0."
+  group: "Longhorn Default Settings"
+  type: int
+  min: 0
+  default: 1
+- variable: defaultSettings.supportBundleFailedHistoryLimit
+  label: SupportBundle Failed History Limit
+  description: "This setting specifies how many failed support bundles can exist in the cluster. Set this value to **0** to have Longhorn automatically purge all failed support bundles."
+  group: "Longhorn Default Settings"
+  type: int
+  min: 0
+  default: 1
 - variable: defaultSettings.autoSalvage
   label: Automatic salvage
   description: "If enabled, volumes will be automatically salvaged when all the replicas become faulty e.g. due to network disconnection. Longhorn will try to figure out which replica(s) are usable, then use them for the volume. By default true."
@@ -360,9 +389,7 @@ Disabling **Backupstore Poll Interval** also means to disable failed backup auto
   default: "true"
 - variable: defaultSettings.autoDeletePodWhenVolumeDetachedUnexpectedly
   label: Automatically Delete Workload Pod when The Volume Is Detached Unexpectedly
-  description: 'If enabled, Longhorn will automatically delete the workload pod that is managed by a controller (e.g. deployment, statefulset, daemonset, etc...) when Longhorn volume is detached unexpectedly (e.g. during Kubernetes upgrade, Docker reboot, or network disconnect). By deleting the pod, its controller restarts the pod and Kubernetes handles volume reattachment and remount.
-If disabled, Longhorn will not delete the workload pod that is managed by a controller. You will have to manually restart the pod to reattach and remount the volume.
-**Note:** This setting does not apply to the workload pods that do not have a controller. Longhorn never deletes them.'
+  description: 'If enabled, Longhorn will automatically delete the workload pod that is managed by a controller (e.g. deployment, statefulset, daemonset, etc...) when Longhorn volume is detached unexpectedly (e.g. during Kubernetes upgrade, Docker reboot, or network disconnect). By deleting the pod, its controller restarts the pod and Kubernetes handles volume reattachment and remount.'
   group: "Longhorn Default Settings"
   type: boolean
   default: "true"
@@ -378,13 +405,27 @@ If disabled, Longhorn will not delete the workload pod that is managed by a cont
   group: "Longhorn Default Settings"
   type: boolean
   default: "true"
+- variable: defaultSettings.replicaDiskSoftAntiAffinity
+  label: Replica Disk Level Soft Anti-Affinity
+  description: 'Allow scheduling on disks with existing healthy replicas of the same volume. By default true.'
+  group: "Longhorn Default Settings"
+  type: boolean
+  default: "true"
+- variable: defaultSettings.allowEmptyNodeSelectorVolume
+  label: Allow Empty Node Selector Volume
+  description: "Allow Scheduling Empty Node Selector Volumes To Any Node"
+  group: "Longhorn Default Settings"
+  type: boolean
+  default: "true"
+- variable: defaultSettings.allowEmptyDiskSelectorVolume
+  label: Allow Empty Disk Selector Volume
+  description: "Allow Scheduling Empty Disk Selector Volumes To Any Disk"
+  group: "Longhorn Default Settings"
+  type: boolean
+  default: "true"
 - variable: defaultSettings.nodeDownPodDeletionPolicy
   label: Pod Deletion Policy When Node is Down
-  description: "Defines the Longhorn action when a Volume is stuck with a StatefulSet/Deployment Pod on a node that is down.
-- **do-nothing** is the default Kubernetes behavior of never force deleting StatefulSet/Deployment terminating pods. Since the pod on the node that is down isn't removed, Longhorn volumes are stuck on nodes that are down.
-- **delete-statefulset-pod** Longhorn will force delete StatefulSet terminating pods on nodes that are down to release Longhorn volumes so that Kubernetes can spin up replacement pods.
-- **delete-deployment-pod** Longhorn will force delete Deployment terminating pods on nodes that are down to release Longhorn volumes so that Kubernetes can spin up replacement pods.
-- **delete-both-statefulset-and-deployment-pod** Longhorn will force delete StatefulSet/Deployment terminating pods on nodes that are down to release Longhorn volumes so that Kubernetes can spin up replacement pods."
+  description: "Defines the Longhorn action when a Volume is stuck with a StatefulSet/Deployment Pod on a node that is down."
   group: "Longhorn Default Settings"
   type: enum
   options:
@@ -393,19 +434,9 @@ If disabled, Longhorn will not delete the workload pod that is managed by a cont
   - "delete-deployment-pod"
   - "delete-both-statefulset-and-deployment-pod"
   default: "do-nothing"
-- variable: defaultSettings.allowNodeDrainWithLastHealthyReplica
-  label: Allow Node Drain with the Last Healthy Replica
-  description: "By default, Longhorn will block `kubectl drain` action on a node if the node contains the last healthy replica of a volume.
-If this setting is enabled, Longhorn will **not** block `kubectl drain` action on a node even if the node contains the last healthy replica of a volume."
-  group: "Longhorn Default Settings"
-  type: boolean
-  default: "false"
 - variable: defaultSettings.nodeDrainPolicy
   label: Node Drain Policy
-  description: "Define the policy to use when a node with the last healthy replica of a volume is drained.
-- **block-if-contains-last-replica** Longhorn will block the drain when the node contains the last healthy replica of a volume.
-- **allow-if-replica-is-stopped** Longhorn will allow the drain when the node contains the last healthy replica of a volume but the replica is stopped. WARNING: possible data loss if the node is removed after draining. Select this option if you want to drain the node and do in-place upgrade/maintenance.
-- **always-allow** Longhorn will allow the drain even though the node contains the last healthy replica of a volume. WARNING: possible data loss if the node is removed after draining. Also possible data corruption if the last replica was running during the draining."
+  description: "Define the policy to use when a node with the last healthy replica of a volume is drained."
   group: "Longhorn Default Settings"
   type: enum
   options:
@@ -413,33 +444,23 @@ If this setting is enabled, Longhorn will **not** block `kubectl drain` action o
   - "allow-if-replica-is-stopped"
   - "always-allow"
   default: "block-if-contains-last-replica"
-- variable: defaultSettings.mkfsExt4Parameters
-  label: Custom mkfs.ext4 parameters
-  description: "Allows setting additional filesystem creation parameters for ext4. For older host kernels it might be necessary to disable the optional ext4 metadata_csum feature by specifying `-O ^64bit,^metadata_csum`."
-  group: "Longhorn Default Settings"
-  type: string
-- variable: defaultSettings.disableReplicaRebuild
-  label: Disable Replica Rebuild
-  description: "This setting disable replica rebuild cross the whole cluster, eviction and data locality feature won't work if this setting is true. But doesn't have any impact to any current replica rebuild and restore disaster recovery volume."
-  group: "Longhorn Default Settings"
-  type: boolean
-  default: "false"
 - variable: defaultSettings.replicaReplenishmentWaitInterval
   label: Replica Replenishment Wait Interval
-  description: "In seconds. The interval determines how long Longhorn will wait at least in order to reuse the existing data on a failed replica rather than directly creating a new replica for a degraded volume.
-Warning: This option works only when there is a failed replica in the volume. And this option may block the rebuilding for a while in the case."
+  description: "In seconds. The interval determines how long Longhorn will wait at least in order to reuse the existing data on a failed replica rather than directly creating a new replica for a degraded volume."
   group: "Longhorn Default Settings"
   type: int
   min: 0
   default: 600
 - variable: defaultSettings.concurrentReplicaRebuildPerNodeLimit
   label: Concurrent Replica Rebuild Per Node Limit
-  description: "This setting controls how many replicas on a node can be rebuilt simultaneously.
-Typically, Longhorn can block the replica starting once the current rebuilding count on a node exceeds the limit. But when the value is 0, it means disabling the replica rebuilding.
-WARNING:
-  - The old setting \"Disable Replica Rebuild\" is replaced by this setting.
-  - Different from relying on replica starting delay to limit the concurrent rebuilding, if the rebuilding is disabled, replica object replenishment will be directly skipped.
-  - When the value is 0, the eviction and data locality feature won't work. But this shouldn't have any impact to any current replica rebuild and backup restore."
+  description: "This setting controls how many replicas on a node can be rebuilt simultaneously."
+  group: "Longhorn Default Settings"
+  type: int
+  min: 0
+  default: 5
+- variable: defaultSettings.concurrentVolumeBackupRestorePerNodeLimit
+  label: Concurrent Volume Backup Restore Per Node Limit
+  description: "This setting controls how many volumes on a node can restore the backup concurrently. Set the value to **0** to disable backup restore."
   group: "Longhorn Default Settings"
   type: int
   min: 0
@@ -488,58 +509,28 @@ WARNING:
   default: 60
 - variable: defaultSettings.backingImageRecoveryWaitInterval
   label: Backing Image Recovery Wait Interval
-  description: "This interval in seconds determines how long Longhorn will wait before re-downloading the backing image file when all disk files of this backing image become failed or unknown.
-WARNING:
-  - This recovery only works for the backing image of which the creation type is \"download\".
-  - File state \"unknown\" means the related manager pods on the pod is not running or the node itself is down/disconnected."
+  description: "This interval in seconds determines how long Longhorn will wait before re-downloading the backing image file when all disk files of this backing image become failed or unknown."
   group: "Longhorn Default Settings"
   type: int
   min: 0
   default: 300
-- variable: defaultSettings.guaranteedEngineManagerCPU
-  label: Guaranteed Engine Manager CPU
-  description: "This integer value indicates how many percentage of the total allocatable CPU on each node will be reserved for each engine manager Pod. For example, 10 means 10% of the total CPU on a node will be allocated to each engine manager pod on this node. This will help maintain engine stability during high node workload.
-In order to prevent unexpected volume engine crash as well as guarantee a relative acceptable IO performance, you can use the following formula to calculate a value for this setting:
-Guaranteed Engine Manager CPU = The estimated max Longhorn volume engine count on a node * 0.1 / The total allocatable CPUs on the node * 100.
-The result of above calculation doesn't mean that's the maximum CPU resources the Longhorn workloads require. To fully exploit the Longhorn volume I/O performance, you can allocate/guarantee more CPU resources via this setting.
-If it's hard to estimate the usage now, you can leave it with the default value, which is 12%. Then you can tune it when there is no running workload using Longhorn volumes.
-WARNING:
-  - Value 0 means unsetting CPU requests for engine manager pods.
-  - Considering the possible new instance manager pods in the further system upgrade, this integer value is range from 0 to 40. And the sum with setting 'Guaranteed Engine Manager CPU' should not be greater than 40.
-  - One more set of instance manager pods may need to be deployed when the Longhorn system is upgraded. If current available CPUs of the nodes are not enough for the new instance manager pods, you need to detach the volumes using the oldest instance manager pods so that Longhorn can clean up the old pods automatically and release the CPU resources. And the new pods with the latest instance manager image will be launched then.
-  - This global setting will be ignored for a node if the field \"EngineManagerCPURequest\" on the node is set.
-  - After this setting is changed, all engine manager pods using this global setting on all the nodes will be automatically restarted. In other words, DO NOT CHANGE THIS SETTING WITH ATTACHED VOLUMES."
+- variable: defaultSettings.guaranteedInstanceManagerCPU
+  label: Guaranteed Instance Manager CPU
+  description: "This integer value indicates how many percentage of the total allocatable CPU on each node will be reserved for each instance manager Pod. You can leave it with the default value, which is 12%."
   group: "Longhorn Default Settings"
   type: int
   min: 0
   max: 40
   default: 12
-- variable: defaultSettings.guaranteedReplicaManagerCPU
-  label: Guaranteed Replica Manager CPU
-  description: "This integer value indicates how many percentage of the total allocatable CPU on each node will be reserved for each replica manager Pod. 10 means 10% of the total CPU on a node will be allocated to each replica manager pod on this node. This will help maintain replica stability during high node workload.
-In order to prevent unexpected volume replica crash as well as guarantee a relative acceptable IO performance, you can use the following formula to calculate a value for this setting:
-Guaranteed Replica Manager CPU = The estimated max Longhorn volume replica count on a node * 0.1 / The total allocatable CPUs on the node * 100.
-The result of above calculation doesn't mean that's the maximum CPU resources the Longhorn workloads require. To fully exploit the Longhorn volume I/O performance, you can allocate/guarantee more CPU resources via this setting.
-If it's hard to estimate the usage now, you can leave it with the default value, which is 12%. Then you can tune it when there is no running workload using Longhorn volumes.
-WARNING:
-  - Value 0 means unsetting CPU requests for replica manager pods.
-  - Considering the possible new instance manager pods in the further system upgrade, this integer value is range from 0 to 40. And the sum with setting 'Guaranteed Replica Manager CPU' should not be greater than 40.
-  - One more set of instance manager pods may need to be deployed when the Longhorn system is upgraded. If current available CPUs of the nodes are not enough for the new instance manager pods, you need to detach the volumes using the oldest instance manager pods so that Longhorn can clean up the old pods automatically and release the CPU resources. And the new pods with the latest instance manager image will be launched then.
-  - This global setting will be ignored for a node if the field \"ReplicaManagerCPURequest\" on the node is set.
-  - After this setting is changed, all replica manager pods using this global setting on all the nodes will be automatically restarted. In other words, DO NOT CHANGE THIS SETTING WITH ATTACHED VOLUMES."
+- variable: defaultSettings.logLevel
+  label: Log Level
+  description: "The log level Panic, Fatal, Error, Warn, Info, Debug, Trace used in longhorn manager. Default to Info."
   group: "Longhorn Default Settings"
-  type: int
-  min: 0
-  max: 40
-  default: 12
+  type: string
+  default: "Info"
 - variable: defaultSettings.kubernetesClusterAutoscalerEnabled
   label: Kubernetes Cluster Autoscaler Enabled (Experimental)
-  description: "Enabling this setting will notify Longhorn that the cluster is using Kubernetes Cluster Autoscaler.
-Longhorn prevents data loss by only allowing the Cluster Autoscaler to scale down a node that met all conditions:
-- No volume attached to the node.
-- Is not the last node containing the replica of any volume.
-- Is not running backing image components pod.
-- Is not running share manager components pod."
+  description: "Enabling this setting will notify Longhorn that the cluster is using Kubernetes Cluster Autoscaler."
   group: "Longhorn Default Settings"
   type: boolean
   default: false
@@ -551,15 +542,94 @@ WARNING:
   default: false
 - variable: defaultSettings.storageNetwork
   label: Storage Network
-  description: "Longhorn uses the storage network for in-cluster data traffic. Leave this blank to use the Kubernetes cluster network.
-To segregate the storage network, input the pre-existing NetworkAttachmentDefinition in \"<namespace>/<name>\" format.
-WARNING:
-  - The cluster must have pre-existing Multus installed, and NetworkAttachmentDefinition IPs are reachable between nodes.
-  - DO NOT CHANGE THIS SETTING WITH ATTACHED VOLUMES. Longhorn will try to block this setting update when there are attached volumes.
-  - When applying the setting, Longhorn will restart all manager, instance-manager, and backing-image-manager pods."
+  description: "Longhorn uses the storage network for in-cluster data traffic. Leave this blank to use the Kubernetes cluster network."
   group: "Longhorn Default Settings"
   type: string
   default:
+- variable: defaultSettings.deletingConfirmationFlag
+  label: Deleting Confirmation Flag
+  description: "This flag is designed to prevent Longhorn from being accidentally uninstalled which will lead to data lost."
+  group: "Longhorn Default Settings"
+  type: boolean
+  default: "false"
+- variable: defaultSettings.engineReplicaTimeout
+  label: Timeout between Engine and Replica
+  description: "In seconds. The setting specifies the timeout between the engine and replica(s), and the value should be between 8 to 30 seconds. The default value is 8 seconds."
+  group: "Longhorn Default Settings"
+  type: int
+  default: "8"
+- variable: defaultSettings.snapshotDataIntegrity
+  label: Snapshot Data Integrity
+  description: "This setting allows users to enable or disable snapshot hashing and data integrity checking."
+  group: "Longhorn Default Settings"
+  type: string
+  default: "disabled"
+- variable: defaultSettings.snapshotDataIntegrityImmediateCheckAfterSnapshotCreation
+  label: Immediate Snapshot Data Integrity Check After Creating a Snapshot
+  description: "Hashing snapshot disk files impacts the performance of the system. The immediate snapshot hashing and checking can be disabled to minimize the impact after creating a snapshot."
+  group: "Longhorn Default Settings"
+  type: boolean
+  default: "false"
+- variable: defaultSettings.snapshotDataIntegrityCronjob
+  label: Snapshot Data Integrity Check CronJob
+  description: "Unix-cron string format. The setting specifies when Longhorn checks the data integrity of snapshot disk files."
+  group: "Longhorn Default Settings"
+  type: string
+  default: "0 0 */7 * *"
+- variable: defaultSettings.removeSnapshotsDuringFilesystemTrim
+  label: Remove Snapshots During Filesystem Trim
+  description: "This setting allows Longhorn filesystem trim feature to automatically mark the latest snapshot and its ancestors as removed and stops at the snapshot containing multiple children."
+  group: "Longhorn Default Settings"
+  type: boolean
+  default: "false"
+- variable: defaultSettings.fastReplicaRebuildEnabled
+  label: Fast Replica Rebuild Enabled
+  description: "This feature supports the fast replica rebuilding. It relies on the checksum of snapshot disk files, so setting the snapshot-data-integrity to **enable** or **fast-check** is a prerequisite."
+  group: "Longhorn Default Settings"
+  type: boolean
+  default: false
+- variable: defaultSettings.replicaFileSyncHttpClientTimeout
+  label: Timeout of HTTP Client to Replica File Sync Server
+  description: "In seconds. The setting specifies the HTTP client timeout to the file sync server."
+  group: "Longhorn Default Settings"
+  type: int
+  default: "30"
+- variable: defaultSettings.backupCompressionMethod
+  label: Backup Compression Method
+  description: "This setting allows users to specify backup compression method."
+  group: "Longhorn Default Settings"
+  type: string
+  default: "lz4"
+- variable: defaultSettings.backupConcurrentLimit
+  label: Backup Concurrent Limit Per Backup
+  description: "This setting controls how many worker threads per backup concurrently."
+  group: "Longhorn Default Settings"
+  type: int
+  min: 1
+  default: 2
+- variable: defaultSettings.restoreConcurrentLimit
+  label: Restore Concurrent Limit Per Backup
+  description: "This setting controls how many worker threads per restore concurrently."
+  group: "Longhorn Default Settings"
+  type: int
+  min: 1
+  default: 2
+- variable: defaultSettings.v2DataEngine
+  label: V2 Data Engine
+  description: "This allows users to activate v2 data engine based on SPDK. Currently, it is in the preview phase and should not be utilized in a production environment."
+  group: "Longhorn V2 Data Engine (Preview Feature) Settings"
+  type: boolean
+  default: false
+- variable: defaultSettings.offlineReplicaRebuilding
+  label: Offline Replica Rebuilding
+  description: "This setting allows users to enable the offline replica rebuilding for volumes using v2 data engine."
+  group: "Longhorn V2 Data Engine (Preview Feature) Settings"
+  required: true
+  type: enum
+  options:
+  - "enabled"
+  - "disabled"
+  default: "enabled"
 - variable: persistence.defaultClass
   default: "true"
   description: "Set as default StorageClass for Longhorn"
@@ -569,7 +639,7 @@ WARNING:
   type: boolean
 - variable: persistence.reclaimPolicy
   label: Storage Class Retain Policy
-  description: "Define reclaim policy (Retain or Delete)"
+  description: "Define reclaim policy. Options: `Retain`, `Delete`"
   group: "Longhorn Storage Class Settings"
   required: true
   type: enum
@@ -586,7 +656,7 @@ WARNING:
   max: 10
   default: 3
 - variable: persistence.defaultDataLocality
-  description: "Set data locality for Longhorn StorageClass"
+  description: "Set data locality for Longhorn StorageClass. Options: `disabled`, `best-effort`"
   label: Default Storage Class Data Locality
   group: "Longhorn Storage Class Settings"
   type: enum
@@ -608,6 +678,20 @@ WARNING:
   group: "Longhorn Storage Class Settings"
   type: string
   default:
+- variable: persistence.defaultNodeSelector.enable
+  description: "Enable Node selector for Longhorn StorageClass"
+  group: "Longhorn Storage Class Settings"
+  label: Enable Storage Class Node Selector
+  type: boolean
+  default: false
+  show_subquestion_if: true
+  subquestions:
+  - variable: persistence.defaultNodeSelector.selector
+    label: Storage Class Node Selector
+    description: 'This selector enables only certain nodes having these tags to be used for the volume. e.g. `"storage,fast"`'
+    group: "Longhorn Storage Class Settings"
+    type: string
+    default:
 - variable: persistence.backingImage.enable
   description: "Set backing image for Longhorn StorageClass"
   group: "Longhorn Storage Class Settings"
@@ -657,6 +741,16 @@ WARNING:
   group: "Longhorn Storage Class Settings"
   type: string
   default:
+- variable: persistence.removeSnapshotsDuringFilesystemTrim
+  description: "Allow automatically removing snapshots during filesystem trim for Longhorn StorageClass. Options: `ignored`, `enabled`, `disabled`"
+  label: Default Storage Class Remove Snapshots During Filesystem Trim
+  group: "Longhorn Storage Class Settings"
+  type: enum
+  options:
+  - "ignored"
+  - "enabled"
+  - "disabled"
+  default: "ignored"
 - variable: ingress.enabled
   default: "false"
   description: "Expose app using Layer 7 Load Balancer - ingress"
@@ -679,7 +773,7 @@ WARNING:
   label: Ingress Path
 - variable: service.ui.type
   default: "Rancher-Proxy"
-  description: "Define Longhorn UI service type"
+  description: "Define Longhorn UI service type. Options: `ClusterIP`, `NodePort`, `LoadBalancer`, `Rancher-Proxy`"
   type: enum
   options:
   - "ClusterIP"
@@ -700,7 +794,7 @@ WARNING:
   show_if: "service.ui.type=NodePort||service.ui.type=LoadBalancer"
   label: UI Service NodePort number
 - variable: enablePSP
-  default: "true"
+  default: "false"
   description: "Setup a pod security policy for Longhorn workloads."
   label: Pod Security Policy
   type: boolean
@@ -711,3 +805,21 @@ WARNING:
   label: Rancher Windows Cluster
   type: boolean
   group: "Other Settings"
+- variable: networkPolicies.enabled
+  description: "Enable NetworkPolicies to limit access to the longhorn pods.
+Warning: The Rancher Proxy will not work if this feature is enabled and a custom NetworkPolicy must be added."
+  group: "Other Settings"
+  label: Network Policies
+  default: "false"
+  type: boolean
+  subquestions:
+  - variable: networkPolicies.type
+    label: Network Policies for Ingress
+    description: "Create the policy based on your distribution to allow access for the ingress. Options: `k3s`, `rke2`, `rke1`"
+    show_if: "networkPolicies.enabled=true&&ingress.enabled=true"
+    type: enum
+    default: "rke2"
+    options:
+    - "rke1"
+    - "rke2"
+    - "k3s"


@@ -11,7 +11,7 @@ rules:
   verbs:
   - "*"
 - apiGroups: [""]
-  resources: ["pods", "events", "persistentvolumes", "persistentvolumeclaims","persistentvolumeclaims/status", "nodes", "proxy/nodes", "pods/log", "secrets", "services", "endpoints", "configmaps"]
+  resources: ["pods", "events", "persistentvolumes", "persistentvolumeclaims","persistentvolumeclaims/status", "nodes", "proxy/nodes", "pods/log", "secrets", "services", "endpoints", "configmaps", "serviceaccounts"]
   verbs: ["*"]
 - apiGroups: [""]
   resources: ["namespaces"]
@@ -23,7 +23,7 @@ rules:
   resources: ["jobs", "cronjobs"]
   verbs: ["*"]
 - apiGroups: ["policy"]
-  resources: ["poddisruptionbudgets"]
+  resources: ["poddisruptionbudgets", "podsecuritypolicies"]
   verbs: ["*"]
 - apiGroups: ["scheduling.k8s.io"]
   resources: ["priorityclasses"]
@@ -37,10 +37,15 @@ rules:
 - apiGroups: ["longhorn.io"]
   resources: ["volumes", "volumes/status", "engines", "engines/status", "replicas", "replicas/status", "settings",
              "engineimages", "engineimages/status", "nodes", "nodes/status", "instancemanagers", "instancemanagers/status",
+{{- if .Values.openshift.enabled }}
+             "engineimages/finalizers", "nodes/finalizers", "instancemanagers/finalizers",
+{{- end }}
              "sharemanagers", "sharemanagers/status", "backingimages", "backingimages/status",
              "backingimagemanagers", "backingimagemanagers/status", "backingimagedatasources", "backingimagedatasources/status",
              "backuptargets", "backuptargets/status", "backupvolumes", "backupvolumes/status", "backups", "backups/status",
-             "recurringjobs", "recurringjobs/status", "orphans", "orphans/status", "snapshots", "snapshots/status"]
+             "recurringjobs", "recurringjobs/status", "orphans", "orphans/status", "snapshots", "snapshots/status",
+             "supportbundles", "supportbundles/status", "systembackups", "systembackups/status", "systemrestores", "systemrestores/status",
+             "volumeattachments", "volumeattachments/status"]
   verbs: ["*"]
 - apiGroups: ["coordination.k8s.io"]
   resources: ["leases"]
@@ -54,3 +59,19 @@ rules:
 - apiGroups: ["admissionregistration.k8s.io"]
   resources: ["mutatingwebhookconfigurations", "validatingwebhookconfigurations"]
   verbs: ["get", "list", "create", "patch", "delete"]
+- apiGroups: ["rbac.authorization.k8s.io"]
+  resources: ["roles", "rolebindings", "clusterrolebindings", "clusterroles"]
+  verbs: ["*"]
+{{- if .Values.openshift.enabled }}
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: longhorn-ocp-privileged-role
+  labels: {{- include "longhorn.labels" . | nindent 4 }}
+rules:
+- apiGroups: ["security.openshift.io"]
+  resources: ["securitycontextconstraints"]
+  resourceNames: ["anyuid", "privileged"]
+  verbs: ["use"]
+{{- end }}
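With `openshift.enabled=true`, the extra ClusterRole above grants the `use` verb on the `anyuid` and `privileged` SCCs. A rough way to confirm the grant after install (a sketch, assuming the chart was deployed to the `longhorn-system` namespace):

```bash
# Inspect the rendered role and check whether the service account may "use" the privileged SCC
oc get clusterrole longhorn-ocp-privileged-role -o yaml
oc auth can-i use securitycontextconstraints/privileged \
  --as=system:serviceaccount:longhorn-system:longhorn-service-account
```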


@@ -11,3 +11,39 @@ subjects:
 - kind: ServiceAccount
   name: longhorn-service-account
   namespace: {{ include "release_namespace" . }}
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: longhorn-support-bundle
+  labels: {{- include "longhorn.labels" . | nindent 4 }}
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: cluster-admin
+subjects:
+- kind: ServiceAccount
+  name: longhorn-support-bundle
+  namespace: {{ include "release_namespace" . }}
+{{- if .Values.openshift.enabled }}
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: longhorn-ocp-privileged-bind
+  labels: {{- include "longhorn.labels" . | nindent 4 }}
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: longhorn-ocp-privileged-role
+subjects:
+- kind: ServiceAccount
+  name: longhorn-service-account
+  namespace: {{ include "release_namespace" . }}
+- kind: ServiceAccount
+  name: longhorn-ui-service-account
+  namespace: {{ include "release_namespace" . }}
+- kind: ServiceAccount
+  name: default # supportbundle-agent-support-bundle uses default sa
+  namespace: {{ include "release_namespace" . }}
+{{- end }}

File diff suppressed because it is too large.


@@ -18,10 +18,6 @@ spec:
         {{- toYaml . | nindent 8 }}
       {{- end }}
     spec:
-      initContainers:
-      - name: wait-longhorn-admission-webhook
-        image: {{ template "registry_url" . }}{{ .Values.image.longhorn.manager.repository }}:{{ .Values.image.longhorn.manager.tag }}
-        command: ['sh', '-c', 'while [ $(curl -m 1 -s -o /dev/null -w "%{http_code}" -k https://longhorn-admission-webhook:9443/v1/healthz) != "200" ]; do echo waiting; sleep 2; done']
      containers:
      - name: longhorn-manager
        image: {{ template "registry_url" . }}{{ .Values.image.longhorn.manager.repository }}:{{ .Values.image.longhorn.manager.tag }}
@@ -43,6 +39,8 @@ spec:
        - "{{ template "registry_url" . }}{{ .Values.image.longhorn.shareManager.repository }}:{{ .Values.image.longhorn.shareManager.tag }}"
        - --backing-image-manager-image
        - "{{ template "registry_url" . }}{{ .Values.image.longhorn.backingImageManager.repository }}:{{ .Values.image.longhorn.backingImageManager.tag }}"
+        - --support-bundle-manager-image
+        - "{{ template "registry_url" . }}{{ .Values.image.longhorn.supportBundleKit.repository }}:{{ .Values.image.longhorn.supportBundleKit.tag }}"
        - --manager-image
        - "{{ template "registry_url" . }}{{ .Values.image.longhorn.manager.repository }}:{{ .Values.image.longhorn.manager.tag }}"
        - --service-account
@@ -50,9 +48,17 @@ spec:
        ports:
        - containerPort: 9500
          name: manager
+        - containerPort: 9501
+          name: conversion-wh
+        - containerPort: 9502
+          name: admission-wh
+        - containerPort: 9503
+          name: recov-backend
        readinessProbe:
-          tcpSocket:
-            port: 9500
+          httpGet:
+            path: /v1/healthz
+            port: 9501
+            scheme: HTTPS
        volumeMounts:
        - name: dev
          mountPath: /host/dev/
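The readiness probe above now performs an HTTPS GET against `/v1/healthz` on port 9501 instead of a plain TCP check on 9500. As a rough manual check (a sketch, assuming the default `longhorn-manager` DaemonSet name and the `longhorn-system` namespace), the same endpoint can be queried from inside a manager pod; the curl flags mirror the removed init-container check, so curl is known to exist in the manager image:

```bash
# Reproduce the new readiness check by hand; expect "200" once the manager is ready
kubectl -n longhorn-system exec daemonset/longhorn-manager -- \
  curl -m 1 -s -k -o /dev/null -w "%{http_code}\n" https://localhost:9501/v1/healthz
```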


@ -15,12 +15,17 @@ data:
{{ if not (kindIs "invalid" .Values.defaultSettings.replicaAutoBalance) }}replica-auto-balance: {{ .Values.defaultSettings.replicaAutoBalance }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.replicaAutoBalance) }}replica-auto-balance: {{ .Values.defaultSettings.replicaAutoBalance }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.storageOverProvisioningPercentage) }}storage-over-provisioning-percentage: {{ .Values.defaultSettings.storageOverProvisioningPercentage }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.storageOverProvisioningPercentage) }}storage-over-provisioning-percentage: {{ .Values.defaultSettings.storageOverProvisioningPercentage }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.storageMinimalAvailablePercentage) }}storage-minimal-available-percentage: {{ .Values.defaultSettings.storageMinimalAvailablePercentage }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.storageMinimalAvailablePercentage) }}storage-minimal-available-percentage: {{ .Values.defaultSettings.storageMinimalAvailablePercentage }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.storageReservedPercentageForDefaultDisk) }}storage-reserved-percentage-for-default-disk: {{ .Values.defaultSettings.storageReservedPercentageForDefaultDisk }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.upgradeChecker) }}upgrade-checker: {{ .Values.defaultSettings.upgradeChecker }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.upgradeChecker) }}upgrade-checker: {{ .Values.defaultSettings.upgradeChecker }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.defaultReplicaCount) }}default-replica-count: {{ .Values.defaultSettings.defaultReplicaCount }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.defaultReplicaCount) }}default-replica-count: {{ .Values.defaultSettings.defaultReplicaCount }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.defaultDataLocality) }}default-data-locality: {{ .Values.defaultSettings.defaultDataLocality }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.defaultDataLocality) }}default-data-locality: {{ .Values.defaultSettings.defaultDataLocality }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.defaultLonghornStaticStorageClass) }}default-longhorn-static-storage-class: {{ .Values.defaultSettings.defaultLonghornStaticStorageClass }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.defaultLonghornStaticStorageClass) }}default-longhorn-static-storage-class: {{ .Values.defaultSettings.defaultLonghornStaticStorageClass }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.backupstorePollInterval) }}backupstore-poll-interval: {{ .Values.defaultSettings.backupstorePollInterval }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.backupstorePollInterval) }}backupstore-poll-interval: {{ .Values.defaultSettings.backupstorePollInterval }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.failedBackupTTL) }}failed-backup-ttl: {{ .Values.defaultSettings.failedBackupTTL }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.failedBackupTTL) }}failed-backup-ttl: {{ .Values.defaultSettings.failedBackupTTL }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.restoreVolumeRecurringJobs) }}restore-volume-recurring-jobs: {{ .Values.defaultSettings.restoreVolumeRecurringJobs }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.recurringSuccessfulJobsHistoryLimit) }}recurring-successful-jobs-history-limit: {{ .Values.defaultSettings.recurringSuccessfulJobsHistoryLimit }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.recurringFailedJobsHistoryLimit) }}recurring-failed-jobs-history-limit: {{ .Values.defaultSettings.recurringFailedJobsHistoryLimit }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.supportBundleFailedHistoryLimit) }}support-bundle-failed-history-limit: {{ .Values.defaultSettings.supportBundleFailedHistoryLimit }}{{ end }}
{{- if or (not (kindIs "invalid" .Values.defaultSettings.taintToleration)) (.Values.global.cattle.windowsCluster.enabled) }} {{- if or (not (kindIs "invalid" .Values.defaultSettings.taintToleration)) (.Values.global.cattle.windowsCluster.enabled) }}
taint-toleration: {{ $windowsDefaultSettingTaintToleration := list }}{{ $defaultSettingTaintToleration := list -}} taint-toleration: {{ $windowsDefaultSettingTaintToleration := list }}{{ $defaultSettingTaintToleration := list -}}
{{- if and .Values.global.cattle.windowsCluster.enabled .Values.global.cattle.windowsCluster.defaultSetting.taintToleration -}} {{- if and .Values.global.cattle.windowsCluster.enabled .Values.global.cattle.windowsCluster.defaultSetting.taintToleration -}}
@ -46,13 +51,12 @@ data:
{{ if not (kindIs "invalid" .Values.defaultSettings.autoDeletePodWhenVolumeDetachedUnexpectedly) }}auto-delete-pod-when-volume-detached-unexpectedly: {{ .Values.defaultSettings.autoDeletePodWhenVolumeDetachedUnexpectedly }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.autoDeletePodWhenVolumeDetachedUnexpectedly) }}auto-delete-pod-when-volume-detached-unexpectedly: {{ .Values.defaultSettings.autoDeletePodWhenVolumeDetachedUnexpectedly }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.disableSchedulingOnCordonedNode) }}disable-scheduling-on-cordoned-node: {{ .Values.defaultSettings.disableSchedulingOnCordonedNode }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.disableSchedulingOnCordonedNode) }}disable-scheduling-on-cordoned-node: {{ .Values.defaultSettings.disableSchedulingOnCordonedNode }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.replicaZoneSoftAntiAffinity) }}replica-zone-soft-anti-affinity: {{ .Values.defaultSettings.replicaZoneSoftAntiAffinity }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.replicaZoneSoftAntiAffinity) }}replica-zone-soft-anti-affinity: {{ .Values.defaultSettings.replicaZoneSoftAntiAffinity }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.replicaDiskSoftAntiAffinity) }}replica-disk-soft-anti-affinity: {{ .Values.defaultSettings.replicaDiskSoftAntiAffinity }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.nodeDownPodDeletionPolicy) }}node-down-pod-deletion-policy: {{ .Values.defaultSettings.nodeDownPodDeletionPolicy }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.nodeDownPodDeletionPolicy) }}node-down-pod-deletion-policy: {{ .Values.defaultSettings.nodeDownPodDeletionPolicy }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.allowNodeDrainWithLastHealthyReplica) }}allow-node-drain-with-last-healthy-replica: {{ .Values.defaultSettings.allowNodeDrainWithLastHealthyReplica }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.nodeDrainPolicy) }}node-drain-policy: {{ .Values.defaultSettings.nodeDrainPolicy }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.nodeDrainPolicy) }}node-drain-policy: {{ .Values.defaultSettings.nodeDrainPolicy }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.mkfsExt4Parameters) }}mkfs-ext4-parameters: {{ .Values.defaultSettings.mkfsExt4Parameters }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.disableReplicaRebuild) }}disable-replica-rebuild: {{ .Values.defaultSettings.disableReplicaRebuild }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.replicaReplenishmentWaitInterval) }}replica-replenishment-wait-interval: {{ .Values.defaultSettings.replicaReplenishmentWaitInterval }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.replicaReplenishmentWaitInterval) }}replica-replenishment-wait-interval: {{ .Values.defaultSettings.replicaReplenishmentWaitInterval }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.concurrentReplicaRebuildPerNodeLimit) }}concurrent-replica-rebuild-per-node-limit: {{ .Values.defaultSettings.concurrentReplicaRebuildPerNodeLimit }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.concurrentReplicaRebuildPerNodeLimit) }}concurrent-replica-rebuild-per-node-limit: {{ .Values.defaultSettings.concurrentReplicaRebuildPerNodeLimit }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.concurrentVolumeBackupRestorePerNodeLimit) }}concurrent-volume-backup-restore-per-node-limit: {{ .Values.defaultSettings.concurrentVolumeBackupRestorePerNodeLimit }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.disableRevisionCounter) }}disable-revision-counter: {{ .Values.defaultSettings.disableRevisionCounter }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.disableRevisionCounter) }}disable-revision-counter: {{ .Values.defaultSettings.disableRevisionCounter }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.systemManagedPodsImagePullPolicy) }}system-managed-pods-image-pull-policy: {{ .Values.defaultSettings.systemManagedPodsImagePullPolicy }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.systemManagedPodsImagePullPolicy) }}system-managed-pods-image-pull-policy: {{ .Values.defaultSettings.systemManagedPodsImagePullPolicy }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.allowVolumeCreationWithDegradedAvailability) }}allow-volume-creation-with-degraded-availability: {{ .Values.defaultSettings.allowVolumeCreationWithDegradedAvailability }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.allowVolumeCreationWithDegradedAvailability) }}allow-volume-creation-with-degraded-availability: {{ .Values.defaultSettings.allowVolumeCreationWithDegradedAvailability }}{{ end }}
@ -60,8 +64,23 @@ data:
{{ if not (kindIs "invalid" .Values.defaultSettings.concurrentAutomaticEngineUpgradePerNodeLimit) }}concurrent-automatic-engine-upgrade-per-node-limit: {{ .Values.defaultSettings.concurrentAutomaticEngineUpgradePerNodeLimit }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.concurrentAutomaticEngineUpgradePerNodeLimit) }}concurrent-automatic-engine-upgrade-per-node-limit: {{ .Values.defaultSettings.concurrentAutomaticEngineUpgradePerNodeLimit }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.backingImageCleanupWaitInterval) }}backing-image-cleanup-wait-interval: {{ .Values.defaultSettings.backingImageCleanupWaitInterval }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.backingImageCleanupWaitInterval) }}backing-image-cleanup-wait-interval: {{ .Values.defaultSettings.backingImageCleanupWaitInterval }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.backingImageRecoveryWaitInterval) }}backing-image-recovery-wait-interval: {{ .Values.defaultSettings.backingImageRecoveryWaitInterval }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.backingImageRecoveryWaitInterval) }}backing-image-recovery-wait-interval: {{ .Values.defaultSettings.backingImageRecoveryWaitInterval }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.guaranteedEngineManagerCPU) }}guaranteed-engine-manager-cpu: {{ .Values.defaultSettings.guaranteedEngineManagerCPU }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.guaranteedInstanceManagerCPU) }}guaranteed-instance-manager-cpu: {{ .Values.defaultSettings.guaranteedInstanceManagerCPU }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.guaranteedReplicaManagerCPU) }}guaranteed-replica-manager-cpu: {{ .Values.defaultSettings.guaranteedReplicaManagerCPU }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.kubernetesClusterAutoscalerEnabled) }}kubernetes-cluster-autoscaler-enabled: {{ .Values.defaultSettings.kubernetesClusterAutoscalerEnabled }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.kubernetesClusterAutoscalerEnabled) }}kubernetes-cluster-autoscaler-enabled: {{ .Values.defaultSettings.kubernetesClusterAutoscalerEnabled }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.orphanAutoDeletion) }}orphan-auto-deletion: {{ .Values.defaultSettings.orphanAutoDeletion }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.orphanAutoDeletion) }}orphan-auto-deletion: {{ .Values.defaultSettings.orphanAutoDeletion }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.storageNetwork) }}storage-network: {{ .Values.defaultSettings.storageNetwork }}{{ end }} {{ if not (kindIs "invalid" .Values.defaultSettings.storageNetwork) }}storage-network: {{ .Values.defaultSettings.storageNetwork }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.deletingConfirmationFlag) }}deleting-confirmation-flag: {{ .Values.defaultSettings.deletingConfirmationFlag }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.engineReplicaTimeout) }}engine-replica-timeout: {{ .Values.defaultSettings.engineReplicaTimeout }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.snapshotDataIntegrity) }}snapshot-data-integrity: {{ .Values.defaultSettings.snapshotDataIntegrity }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.snapshotDataIntegrityImmediateCheckAfterSnapshotCreation) }}snapshot-data-integrity-immediate-check-after-snapshot-creation: {{ .Values.defaultSettings.snapshotDataIntegrityImmediateCheckAfterSnapshotCreation }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.snapshotDataIntegrityCronjob) }}snapshot-data-integrity-cronjob: {{ .Values.defaultSettings.snapshotDataIntegrityCronjob }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.removeSnapshotsDuringFilesystemTrim) }}remove-snapshots-during-filesystem-trim: {{ .Values.defaultSettings.removeSnapshotsDuringFilesystemTrim }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.fastReplicaRebuildEnabled) }}fast-replica-rebuild-enabled: {{ .Values.defaultSettings.fastReplicaRebuildEnabled }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.replicaFileSyncHttpClientTimeout) }}replica-file-sync-http-client-timeout: {{ .Values.defaultSettings.replicaFileSyncHttpClientTimeout }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.logLevel) }}log-level: {{ .Values.defaultSettings.logLevel }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.backupCompressionMethod) }}backup-compression-method: {{ .Values.defaultSettings.backupCompressionMethod }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.backupConcurrentLimit) }}backup-concurrent-limit: {{ .Values.defaultSettings.backupConcurrentLimit }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.restoreConcurrentLimit) }}restore-concurrent-limit: {{ .Values.defaultSettings.restoreConcurrentLimit }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.v2DataEngine) }}v2-data-engine: {{ .Values.defaultSettings.v2DataEngine }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.offlineReplicaRebuilding) }}offline-replica-rebuilding: {{ .Values.defaultSettings.offlineReplicaRebuilding }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.allowEmptyNodeSelectorVolume) }}allow-empty-node-selector-volume: {{ .Values.defaultSettings.allowEmptyNodeSelectorVolume }}{{ end }}
{{ if not (kindIs "invalid" .Values.defaultSettings.allowEmptyDiskSelectorVolume) }}allow-empty-disk-selector-volume: {{ .Values.defaultSettings.allowEmptyDiskSelectorVolume }}{{ end }}


@ -1,3 +1,41 @@
{{- if .Values.openshift.enabled }}
{{- if .Values.openshift.ui.route }}
# https://github.com/openshift/oauth-proxy/blob/master/contrib/sidecar.yaml
# Create a proxy service account and ensure it will use the route "proxy"
# Create a secure connection to the proxy via a route
apiVersion: route.openshift.io/v1
kind: Route
metadata:
labels: {{- include "longhorn.labels" . | nindent 4 }}
app: longhorn-ui
name: {{ .Values.openshift.ui.route }}
namespace: {{ include "release_namespace" . }}
spec:
to:
kind: Service
name: longhorn-ui
tls:
termination: reencrypt
---
apiVersion: v1
kind: Service
metadata:
labels: {{- include "longhorn.labels" . | nindent 4 }}
app: longhorn-ui
name: longhorn-ui
namespace: {{ include "release_namespace" . }}
annotations:
service.alpha.openshift.io/serving-cert-secret-name: longhorn-ui-tls
spec:
ports:
- name: longhorn-ui
port: {{ .Values.openshift.ui.port | default 443 }}
targetPort: {{ .Values.openshift.ui.proxy | default 8443 }}
selector:
app: longhorn-ui
---
{{- end }}
{{- end }}
  apiVersion: apps/v1
  kind: Deployment
  metadata:
@ -15,6 +53,7 @@ spec:
  labels: {{- include "longhorn.labels" . | nindent 8 }}
    app: longhorn-ui
  spec:
+   serviceAccountName: longhorn-ui-service-account
    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
@ -28,6 +67,28 @@ spec:
          - longhorn-ui
        topologyKey: kubernetes.io/hostname
    containers:
{{- if .Values.openshift.enabled }}
{{- if .Values.openshift.ui.route }}
- name: oauth-proxy
image: {{ template "registry_url" . }}{{ .Values.image.openshift.oauthProxy.repository }}:{{ .Values.image.openshift.oauthProxy.tag }}
imagePullPolicy: IfNotPresent
ports:
- containerPort: {{ .Values.openshift.ui.proxy | default 8443 }}
name: public
args:
- --https-address=:{{ .Values.openshift.ui.proxy | default 8443 }}
- --provider=openshift
- --openshift-service-account=longhorn-ui-service-account
- --upstream=http://localhost:8000
- --tls-cert=/etc/tls/private/tls.crt
- --tls-key=/etc/tls/private/tls.key
- --cookie-secret=SECRET
- --openshift-sar={"namespace":"{{ include "release_namespace" . }}","group":"longhorn.io","resource":"setting","verb":"delete"}
volumeMounts:
- mountPath: /etc/tls/private
name: longhorn-ui-tls
{{- end }}
{{- end }}
  - name: longhorn-ui
    image: {{ template "registry_url" . }}{{ .Values.image.longhorn.ui.repository }}:{{ .Values.image.longhorn.ui.tag }}
    imagePullPolicy: {{ .Values.image.pullPolicy }}
@ -47,6 +108,13 @@ spec:
    - name: LONGHORN_UI_PORT
      value: "8000"
  volumes:
{{- if .Values.openshift.enabled }}
{{- if .Values.openshift.ui.route }}
- name: longhorn-ui-tls
secret:
secretName: longhorn-ui-tls
{{- end }}
{{- end }}
  - emptyDir: {}
    name: nginx-cache
  - emptyDir: {}
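The oauth-proxy sidecar, the Route, and the longhorn-ui-tls volume above are all gated behind `openshift.enabled` and `openshift.ui.route`. A minimal values sketch that would enable this path on OCP/OKD (the route name here is only an example; 443 and 8443 are the defaults the template falls back to):

openshift:
  enabled: true
  ui:
    route: longhorn-ui-route
    port: 443
    proxy: 8443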


@ -1,166 +0,0 @@
apiVersion: apps/v1
kind: Deployment
metadata:
labels: {{- include "longhorn.labels" . | nindent 4 }}
app: longhorn-conversion-webhook
name: longhorn-conversion-webhook
namespace: {{ include "release_namespace" . }}
spec:
replicas: 2
selector:
matchLabels:
app: longhorn-conversion-webhook
template:
metadata:
labels: {{- include "longhorn.labels" . | nindent 8 }}
app: longhorn-conversion-webhook
spec:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- longhorn-conversion-webhook
topologyKey: kubernetes.io/hostname
containers:
- name: longhorn-conversion-webhook
image: {{ template "registry_url" . }}{{ .Values.image.longhorn.manager.repository }}:{{ .Values.image.longhorn.manager.tag }}
imagePullPolicy: {{ .Values.image.pullPolicy }}
securityContext:
runAsUser: 2000
command:
- longhorn-manager
- conversion-webhook
- --service-account
- longhorn-service-account
ports:
- containerPort: 9443
name: conversion-wh
readinessProbe:
tcpSocket:
port: 9443
env:
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
{{- if .Values.privateRegistry.registrySecret }}
imagePullSecrets:
- name: {{ .Values.privateRegistry.registrySecret }}
{{- end }}
{{- if .Values.longhornDriver.priorityClass }}
priorityClassName: {{ .Values.longhornDriver.priorityClass | quote }}
{{- end }}
{{- if or .Values.longhornDriver.tolerations .Values.global.cattle.windowsCluster.enabled }}
tolerations:
{{- if and .Values.global.cattle.windowsCluster.enabled .Values.global.cattle.windowsCluster.tolerations }}
{{ toYaml .Values.global.cattle.windowsCluster.tolerations | indent 6 }}
{{- end }}
{{- if .Values.longhornDriver.tolerations }}
{{ toYaml .Values.longhornDriver.tolerations | indent 6 }}
{{- end }}
{{- end }}
{{- if or .Values.longhornDriver.nodeSelector .Values.global.cattle.windowsCluster.enabled }}
nodeSelector:
{{- if and .Values.global.cattle.windowsCluster.enabled .Values.global.cattle.windowsCluster.nodeSelector }}
{{ toYaml .Values.global.cattle.windowsCluster.nodeSelector | indent 8 }}
{{- end }}
{{- if .Values.longhornDriver.nodeSelector }}
{{ toYaml .Values.longhornDriver.nodeSelector | indent 8 }}
{{- end }}
{{- end }}
serviceAccountName: longhorn-service-account
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels: {{- include "longhorn.labels" . | nindent 4 }}
app: longhorn-admission-webhook
name: longhorn-admission-webhook
namespace: {{ include "release_namespace" . }}
spec:
replicas: 2
selector:
matchLabels:
app: longhorn-admission-webhook
template:
metadata:
labels: {{- include "longhorn.labels" . | nindent 8 }}
app: longhorn-admission-webhook
spec:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- longhorn-admission-webhook
topologyKey: kubernetes.io/hostname
initContainers:
- name: wait-longhorn-conversion-webhook
image: {{ template "registry_url" . }}{{ .Values.image.longhorn.manager.repository }}:{{ .Values.image.longhorn.manager.tag }}
command: ['sh', '-c', 'while [ $(curl -m 1 -s -o /dev/null -w "%{http_code}" -k https://longhorn-conversion-webhook:9443/v1/healthz) != "200" ]; do echo waiting; sleep 2; done']
imagePullPolicy: {{ .Values.image.pullPolicy }}
securityContext:
runAsUser: 2000
containers:
- name: longhorn-admission-webhook
image: {{ template "registry_url" . }}{{ .Values.image.longhorn.manager.repository }}:{{ .Values.image.longhorn.manager.tag }}
imagePullPolicy: {{ .Values.image.pullPolicy }}
securityContext:
runAsUser: 2000
command:
- longhorn-manager
- admission-webhook
- --service-account
- longhorn-service-account
ports:
- containerPort: 9443
name: admission-wh
readinessProbe:
tcpSocket:
port: 9443
env:
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
{{- if .Values.privateRegistry.registrySecret }}
imagePullSecrets:
- name: {{ .Values.privateRegistry.registrySecret }}
{{- end }}
{{- if .Values.longhornDriver.priorityClass }}
priorityClassName: {{ .Values.longhornDriver.priorityClass | quote }}
{{- end }}
{{- if or .Values.longhornDriver.tolerations .Values.global.cattle.windowsCluster.enabled }}
tolerations:
{{- if and .Values.global.cattle.windowsCluster.enabled .Values.global.cattle.windowsCluster.tolerations }}
{{ toYaml .Values.global.cattle.windowsCluster.tolerations | indent 6 }}
{{- end }}
{{- if .Values.longhornDriver.tolerations }}
{{ toYaml .Values.longhornDriver.tolerations | indent 6 }}
{{- end }}
{{- end }}
{{- if or .Values.longhornDriver.nodeSelector .Values.global.cattle.windowsCluster.enabled }}
nodeSelector:
{{- if and .Values.global.cattle.windowsCluster.enabled .Values.global.cattle.windowsCluster.nodeSelector }}
{{ toYaml .Values.global.cattle.windowsCluster.nodeSelector | indent 8 }}
{{- end }}
{{- if or .Values.longhornDriver.nodeSelector }}
{{ toYaml .Values.longhornDriver.nodeSelector | indent 8 }}
{{- end }}
{{- end }}
serviceAccountName: longhorn-service-account


@ -0,0 +1,27 @@
{{- if .Values.networkPolicies.enabled }}
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: backing-image-data-source
namespace: longhorn-system
spec:
podSelector:
matchLabels:
longhorn.io/component: backing-image-data-source
policyTypes:
- Ingress
ingress:
- from:
- podSelector:
matchLabels:
app: longhorn-manager
- podSelector:
matchLabels:
longhorn.io/component: instance-manager
- podSelector:
matchLabels:
longhorn.io/component: backing-image-manager
- podSelector:
matchLabels:
longhorn.io/component: backing-image-data-source
{{- end }}


@ -0,0 +1,27 @@
{{- if .Values.networkPolicies.enabled }}
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: backing-image-manager
namespace: longhorn-system
spec:
podSelector:
matchLabels:
longhorn.io/component: backing-image-manager
policyTypes:
- Ingress
ingress:
- from:
- podSelector:
matchLabels:
app: longhorn-manager
- podSelector:
matchLabels:
longhorn.io/component: instance-manager
- podSelector:
matchLabels:
longhorn.io/component: backing-image-manager
- podSelector:
matchLabels:
longhorn.io/component: backing-image-data-source
{{- end }}


@ -0,0 +1,27 @@
{{- if .Values.networkPolicies.enabled }}
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: instance-manager
namespace: longhorn-system
spec:
podSelector:
matchLabels:
longhorn.io/component: instance-manager
policyTypes:
- Ingress
ingress:
- from:
- podSelector:
matchLabels:
app: longhorn-manager
- podSelector:
matchLabels:
longhorn.io/component: instance-manager
- podSelector:
matchLabels:
longhorn.io/component: backing-image-manager
- podSelector:
matchLabels:
longhorn.io/component: backing-image-data-source
{{- end }}


@ -0,0 +1,35 @@
{{- if .Values.networkPolicies.enabled }}
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: longhorn-manager
namespace: longhorn-system
spec:
podSelector:
matchLabels:
app: longhorn-manager
policyTypes:
- Ingress
ingress:
- from:
- podSelector:
matchLabels:
app: longhorn-manager
- podSelector:
matchLabels:
app: longhorn-ui
- podSelector:
matchLabels:
app: longhorn-csi-plugin
- podSelector:
matchLabels:
longhorn.io/managed-by: longhorn-manager
matchExpressions:
- { key: recurring-job.longhorn.io, operator: Exists }
- podSelector:
matchExpressions:
- { key: longhorn.io/job-task, operator: Exists }
- podSelector:
matchLabels:
app: longhorn-driver-deployer
{{- end }}


@ -0,0 +1,17 @@
{{- if .Values.networkPolicies.enabled }}
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: longhorn-recovery-backend
namespace: longhorn-system
spec:
podSelector:
matchLabels:
app: longhorn-manager
policyTypes:
- Ingress
ingress:
- ports:
- protocol: TCP
port: 9503
{{- end }}


@ -0,0 +1,46 @@
{{- if and .Values.networkPolicies.enabled .Values.ingress.enabled (not (eq .Values.networkPolicies.type "")) }}
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: longhorn-ui-frontend
namespace: longhorn-system
spec:
podSelector:
matchLabels:
app: longhorn-ui
policyTypes:
- Ingress
ingress:
- from:
{{- if eq .Values.networkPolicies.type "rke1"}}
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: ingress-nginx
podSelector:
matchLabels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
{{- else if eq .Values.networkPolicies.type "rke2" }}
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: kube-system
podSelector:
matchLabels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: rke2-ingress-nginx
app.kubernetes.io/name: rke2-ingress-nginx
{{- else if eq .Values.networkPolicies.type "k3s" }}
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: kube-system
podSelector:
matchLabels:
app.kubernetes.io/name: traefik
ports:
- port: 8000
protocol: TCP
- port: 80
protocol: TCP
{{- end }}
{{- end }}


@ -0,0 +1,33 @@
{{- if .Values.networkPolicies.enabled }}
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: longhorn-conversion-webhook
namespace: longhorn-system
spec:
podSelector:
matchLabels:
app: longhorn-manager
policyTypes:
- Ingress
ingress:
- ports:
- protocol: TCP
port: 9501
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: longhorn-admission-webhook
namespace: longhorn-system
spec:
podSelector:
matchLabels:
app: longhorn-manager
policyTypes:
- Ingress
ingress:
- ports:
- protocol: TCP
port: 9502
{{- end }}
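All of the NetworkPolicy manifests above are rendered only when `networkPolicies.enabled` is true, and the UI ingress policy additionally switches on `networkPolicies.type` together with `ingress.enabled`. A sketch of the values for, say, an RKE2 cluster (assuming the bundled rke2-ingress-nginx controller is the ingress in use; the values match the `networkPolicies` block in values.yaml later in this diff):

networkPolicies:
  enabled: true
  type: "rke2"
ingress:
  enabled: true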


@ -19,8 +19,6 @@ spec:
  - name: longhorn-post-upgrade
    image: {{ template "registry_url" . }}{{ .Values.image.longhorn.manager.repository }}:{{ .Values.image.longhorn.manager.tag }}
    imagePullPolicy: {{ .Values.image.pullPolicy }}
-   securityContext:
-     privileged: true
    command:
    - longhorn-manager
    - post-upgrade


@ -0,0 +1,58 @@
{{- if .Values.helmPreUpgradeCheckerJob.enabled }}
apiVersion: batch/v1
kind: Job
metadata:
annotations:
"helm.sh/hook": pre-upgrade
"helm.sh/hook-delete-policy": hook-succeeded,before-hook-creation,hook-failed
name: longhorn-pre-upgrade
namespace: {{ include "release_namespace" . }}
labels: {{- include "longhorn.labels" . | nindent 4 }}
spec:
activeDeadlineSeconds: 900
backoffLimit: 1
template:
metadata:
name: longhorn-pre-upgrade
labels: {{- include "longhorn.labels" . | nindent 8 }}
spec:
containers:
- name: longhorn-pre-upgrade
image: {{ template "registry_url" . }}{{ .Values.image.longhorn.manager.repository }}:{{ .Values.image.longhorn.manager.tag }}
imagePullPolicy: {{ .Values.image.pullPolicy }}
command:
- longhorn-manager
- pre-upgrade
env:
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
restartPolicy: OnFailure
{{- if .Values.privateRegistry.registrySecret }}
imagePullSecrets:
- name: {{ .Values.privateRegistry.registrySecret }}
{{- end }}
{{- if .Values.longhornManager.priorityClass }}
priorityClassName: {{ .Values.longhornManager.priorityClass | quote }}
{{- end }}
serviceAccountName: longhorn-service-account
{{- if or .Values.longhornManager.tolerations .Values.global.cattle.windowsCluster.enabled }}
tolerations:
{{- if and .Values.global.cattle.windowsCluster.enabled .Values.global.cattle.windowsCluster.tolerations }}
{{ toYaml .Values.global.cattle.windowsCluster.tolerations | indent 6 }}
{{- end }}
{{- if .Values.longhornManager.tolerations }}
{{ toYaml .Values.longhornManager.tolerations | indent 6 }}
{{- end }}
{{- end }}
{{- if or .Values.longhornManager.nodeSelector .Values.global.cattle.windowsCluster.enabled }}
nodeSelector:
{{- if and .Values.global.cattle.windowsCluster.enabled .Values.global.cattle.windowsCluster.nodeSelector }}
{{ toYaml .Values.global.cattle.windowsCluster.nodeSelector | indent 8 }}
{{- end }}
{{- if .Values.longhornManager.nodeSelector }}
{{ toYaml .Values.longhornManager.nodeSelector | indent 8 }}
{{- end }}
{{- end }}
{{- end }}
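This pre-upgrade hook is wrapped in `helmPreUpgradeCheckerJob.enabled`, which defaults to true in values.yaml. If the hook is unwanted (for example, some GitOps workflows do not run Helm hooks), it can be switched off; a minimal sketch:

helmPreUpgradeCheckerJob:
  enabled: false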


@ -5,6 +5,36 @@ metadata:
  namespace: {{ include "release_namespace" . }}
  labels: {{- include "longhorn.labels" . | nindent 4 }}
  {{- with .Values.serviceAccount.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: longhorn-ui-service-account
namespace: {{ include "release_namespace" . }}
labels: {{- include "longhorn.labels" . | nindent 4 }}
{{- with .Values.serviceAccount.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
{{- if .Values.openshift.enabled }}
{{- if .Values.openshift.ui.route }}
{{- if not .Values.serviceAccount.annotations }}
annotations:
{{- end }}
serviceaccounts.openshift.io/oauth-redirectreference.primary: '{"kind":"OAuthRedirectReference","apiVersion":"v1","reference":{"kind":"Route","name":"longhorn-ui"}}'
{{- end }}
{{- end }}
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: longhorn-support-bundle
namespace: {{ include "release_namespace" . }}
labels: {{- include "longhorn.labels" . | nindent 4 }}
{{- with .Values.serviceAccount.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}


@ -9,10 +9,10 @@ spec:
  type: ClusterIP
  sessionAffinity: ClientIP
  selector:
-   app: longhorn-conversion-webhook
+   app: longhorn-manager
  ports:
  - name: conversion-webhook
-   port: 9443
+   port: 9501
    targetPort: conversion-wh
  ---
  apiVersion: v1
@ -26,14 +26,31 @@ spec:
  type: ClusterIP
  sessionAffinity: ClientIP
  selector:
-   app: longhorn-admission-webhook
+   app: longhorn-manager
  ports:
  - name: admission-webhook
-   port: 9443
+   port: 9502
    targetPort: admission-wh
  ---
  apiVersion: v1
  kind: Service
metadata:
labels: {{- include "longhorn.labels" . | nindent 4 }}
app: longhorn-recovery-backend
name: longhorn-recovery-backend
namespace: {{ include "release_namespace" . }}
spec:
type: ClusterIP
sessionAffinity: ClientIP
selector:
app: longhorn-manager
ports:
- name: recovery-backend
port: 9503
targetPort: recov-backend
---
apiVersion: v1
kind: Service
  metadata:
  labels: {{- include "longhorn.labels" . | nindent 4 }}
  name: longhorn-engine-manager


@ -23,6 +23,9 @@ data:
  {{- if .Values.persistence.defaultFsType }}
  fsType: "{{ .Values.persistence.defaultFsType }}"
  {{- end }}
+ {{- if .Values.persistence.defaultMkfsParams }}
+ mkfsParams: "{{ .Values.persistence.defaultMkfsParams }}"
+ {{- end }}
  {{- if .Values.persistence.migratable }}
  migratable: "{{ .Values.persistence.migratable }}"
  {{- end }}
@ -36,3 +39,6 @@ data:
  recurringJobSelector: '{{ .Values.persistence.recurringJobSelector.jobList }}'
  {{- end }}
  dataLocality: {{ .Values.persistence.defaultDataLocality | quote }}
{{- if .Values.persistence.defaultNodeSelector.enable }}
nodeSelector: "{{ .Values.persistence.defaultNodeSelector.selector }}"
{{- end }}
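With these additions, `persistence.defaultMkfsParams` and `persistence.defaultNodeSelector` flow into the default StorageClass parameters. An illustrative override (both values are examples only; the selector string follows the `"storage,fast"` tag format documented in values.yaml):

persistence:
  defaultMkfsParams: "-I 256 -b 4096 -O ^metadata_csum,^64bit"
  defaultNodeSelector:
    enable: true
    selector: "storage,fast"

This would render the `mkfsParams` and `nodeSelector` entries into the storageclass ConfigMap shown in this hunk.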


@ -3,7 +3,7 @@ kind: Job
  metadata:
    annotations:
      "helm.sh/hook": pre-delete
-     "helm.sh/hook-delete-policy": hook-succeeded
+     "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
    name: longhorn-uninstall
    namespace: {{ include "release_namespace" . }}
    labels: {{- include "longhorn.labels" . | nindent 4 }}
@ -19,8 +19,6 @@ spec:
  - name: longhorn-uninstall
    image: {{ template "registry_url" . }}{{ .Values.image.longhorn.manager.repository }}:{{ .Values.image.longhorn.manager.tag }}
    imagePullPolicy: {{ .Values.image.pullPolicy }}
-   securityContext:
-     privileged: true
    command:
    - longhorn-manager
    - uninstall
@ -30,7 +28,7 @@ spec:
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
- restartPolicy: OnFailure
+ restartPolicy: Never
  {{- if .Values.privateRegistry.registrySecret }}
  imagePullSecrets:
  - name: {{ .Values.privateRegistry.registrySecret }}
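Besides dropping the privileged securityContext, the uninstall job now uses `restartPolicy: Never`, so a failed run is not restarted in place by the kubelet. Note also that the new `deleting-confirmation-flag` setting (described in values.yaml later in this diff) is meant to guard against accidental uninstalls; an illustrative override that sets it before the pre-delete hook runs:

defaultSettings:
  deletingConfirmationFlag: true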


@ -0,0 +1,7 @@
#{{- if gt (len (lookup "rbac.authorization.k8s.io/v1" "ClusterRole" "" "")) 0 -}}
#{{- if .Values.enablePSP }}
#{{- if not (.Capabilities.APIVersions.Has "policy/v1beta1/PodSecurityPolicy") }}
#{{- fail "The target cluster does not have the PodSecurityPolicy API resource. Please disable PSPs in this chart before proceeding." -}}
#{{- end }}
#{{- end }}
#{{- end }}


@ -3,153 +3,350 @@
  # Declare variables to be passed into your templates.
  global:
  cattle:
+ # -- System default registry
  systemDefaultRegistry: ""
  windowsCluster:
- # Enable this to allow Longhorn to run on the Rancher deployed Windows cluster
+ # -- Enable this to allow Longhorn to run on the Rancher deployed Windows cluster
  enabled: false
- # Tolerate Linux node taint
+ # -- Tolerate Linux nodes to run Longhorn user deployed components
  tolerations:
  - key: "cattle.io/os"
    value: "linux"
    effect: "NoSchedule"
    operator: "Equal"
- # Select Linux nodes
+ # -- Select Linux nodes to run Longhorn user deployed components
  nodeSelector:
    kubernetes.io/os: "linux"
- # Recognize toleration and node selector for Longhorn run-time created components
  defaultSetting:
+ # -- Toleration for Longhorn system managed components
  taintToleration: cattle.io/os=linux:NoSchedule
+ # -- Node selector for Longhorn system managed components
  systemManagedComponentsNodeSelector: kubernetes.io/os:linux
+ networkPolicies:
+ # -- Enable NetworkPolicies to limit access to the Longhorn pods
+ enabled: false
+ # -- Create the policy based on your distribution to allow access for the ingress. Options: `k3s`, `rke2`, `rke1`
+ type: "k3s"
  image:
  longhorn:
  engine:
+ # -- Specify Longhorn engine image repository
  repository: longhornio/longhorn-engine
- tag: v1.3.3-rc3
+ # -- Specify Longhorn engine image tag
+ tag: master-head
  manager:
+ # -- Specify Longhorn manager image repository
  repository: longhornio/longhorn-manager
- tag: v1.3.3-rc3
+ # -- Specify Longhorn manager image tag
+ tag: master-head
  ui:
+ # -- Specify Longhorn ui image repository
  repository: longhornio/longhorn-ui
- tag: v1.3.3-rc3
+ # -- Specify Longhorn ui image tag
+ tag: master-head
  instanceManager:
+ # -- Specify Longhorn instance manager image repository
  repository: longhornio/longhorn-instance-manager
- tag: v1_20230407
+ # -- Specify Longhorn instance manager image tag
+ tag: master-head
  shareManager:
+ # -- Specify Longhorn share manager image repository
  repository: longhornio/longhorn-share-manager
- tag: v1_20230320
+ # -- Specify Longhorn share manager image tag
+ tag: master-head
  backingImageManager:
+ # -- Specify Longhorn backing image manager image repository
  repository: longhornio/backing-image-manager
- tag: v3_20230320
+ # -- Specify Longhorn backing image manager image tag
+ tag: master-head
supportBundleKit:
# -- Specify Longhorn support bundle manager image repository
repository: longhornio/support-bundle-kit
# -- Specify Longhorn support bundle manager image tag
tag: v0.0.27
  csi:
  attacher:
+ # -- Specify CSI attacher image repository. Leave blank to autodetect
  repository: longhornio/csi-attacher
- tag: v3.4.0
+ # -- Specify CSI attacher image tag. Leave blank to autodetect
+ tag: v4.2.0
  provisioner:
+ # -- Specify CSI provisioner image repository. Leave blank to autodetect
  repository: longhornio/csi-provisioner
- tag: v2.1.2
+ # -- Specify CSI provisioner image tag. Leave blank to autodetect
+ tag: v3.4.1
  nodeDriverRegistrar:
+ # -- Specify CSI node driver registrar image repository. Leave blank to autodetect
  repository: longhornio/csi-node-driver-registrar
- tag: v2.5.0
+ # -- Specify CSI node driver registrar image tag. Leave blank to autodetect
+ tag: v2.7.0
  resizer:
+ # -- Specify CSI driver resizer image repository. Leave blank to autodetect
  repository: longhornio/csi-resizer
- tag: v1.2.0
+ # -- Specify CSI driver resizer image tag. Leave blank to autodetect
+ tag: v1.7.0
  snapshotter:
+ # -- Specify CSI driver snapshotter image repository. Leave blank to autodetect
  repository: longhornio/csi-snapshotter
- tag: v3.0.3
+ # -- Specify CSI driver snapshotter image tag. Leave blank to autodetect.
+ tag: v6.2.1
  livenessProbe:
+ # -- Specify CSI liveness probe image repository. Leave blank to autodetect
  repository: longhornio/livenessprobe
- tag: v2.8.0
+ # -- Specify CSI liveness probe image tag. Leave blank to autodetect
+ tag: v2.9.0
+ openshift:
+ oauthProxy:
+ # -- For openshift user. Specify oauth proxy image repository
+ repository: quay.io/openshift/origin-oauth-proxy
+ # -- For openshift user. Specify oauth proxy image tag. Note: Use your OCP/OKD 4.X Version, Current Stable is 4.14
+ tag: 4.14
+ # -- Image pull policy which applies to all user deployed Longhorn Components. e.g, Longhorn manager, Longhorn driver, Longhorn UI
  pullPolicy: IfNotPresent
  service:
  ui:
+ # -- Define Longhorn UI service type. Options: `ClusterIP`, `NodePort`, `LoadBalancer`, `Rancher-Proxy`
  type: ClusterIP
+ # -- NodePort port number (to set explicitly, choose port between 30000-32767)
  nodePort: null
  manager:
+ # -- Define Longhorn manager service type.
  type: ClusterIP
+ # -- NodePort port number (to set explicitly, choose port between 30000-32767)
  nodePort: ""
+ loadBalancerIP: ""
+ loadBalancerSourceRanges: ""
  persistence:
+ # -- Set Longhorn StorageClass as default
  defaultClass: true
+ # -- Set filesystem type for Longhorn StorageClass
  defaultFsType: ext4
+ # -- Set mkfs options for Longhorn StorageClass
+ defaultMkfsParams: ""
+ # -- Set replica count for Longhorn StorageClass
  defaultClassReplicaCount: 3
- defaultDataLocality: disabled # best-effort otherwise
+ # -- Set data locality for Longhorn StorageClass. Options: `disabled`, `best-effort`
+ defaultDataLocality: disabled
+ # -- Define reclaim policy. Options: `Retain`, `Delete`
  reclaimPolicy: Delete
+ # -- Set volume migratable for Longhorn StorageClass
  migratable: false
  recurringJobSelector:
+ # -- Enable recurring job selector for Longhorn StorageClass
  enable: false
+ # -- Recurring job selector list for Longhorn StorageClass. Please be careful of quotes of input. e.g., `[{"name":"backup", "isGroup":true}]`
  jobList: []
  backingImage:
+ # -- Set backing image for Longhorn StorageClass
  enable: false
+ # -- Specify a backing image that will be used by Longhorn volumes in Longhorn StorageClass. If not exists, the backing image data source type and backing image data source parameters should be specified so that Longhorn will create the backing image before using it
  name: ~
+ # -- Specify the data source type for the backing image used in Longhorn StorageClass.
+ # If the backing image does not exists, Longhorn will use this field to create a backing image. Otherwise, Longhorn will use it to verify the selected backing image.
  dataSourceType: ~
+ # -- Specify the data source parameters for the backing image used in Longhorn StorageClass. This option accepts a json string of a map. e.g., `'{\"url\":\"https://backing-image-example.s3-region.amazonaws.com/test-backing-image\"}'`.
  dataSourceParameters: ~
+ # -- Specify the expected SHA512 checksum of the selected backing image in Longhorn StorageClass
  expectedChecksum: ~
defaultNodeSelector:
# -- Enable Node selector for Longhorn StorageClass
enable: false
# -- This selector enables only certain nodes having these tags to be used for the volume. e.g. `"storage,fast"`
selector: ""
# -- Allow automatically removing snapshots during filesystem trim for Longhorn StorageClass. Options: `ignored`, `enabled`, `disabled`
removeSnapshotsDuringFilesystemTrim: ignored
helmPreUpgradeCheckerJob:
enabled: true
  csi:
+ # -- Specify kubelet root-dir. Leave blank to autodetect
  kubeletRootDir: ~
+ # -- Specify replica count of CSI Attacher. Leave blank to use default count: 3
  attacherReplicaCount: ~
+ # -- Specify replica count of CSI Provisioner. Leave blank to use default count: 3
  provisionerReplicaCount: ~
+ # -- Specify replica count of CSI Resizer. Leave blank to use default count: 3
  resizerReplicaCount: ~
+ # -- Specify replica count of CSI Snapshotter. Leave blank to use default count: 3
  snapshotterReplicaCount: ~
  defaultSettings:
+ # -- The endpoint used to access the backupstore. Available: NFS, CIFS, AWS, GCP, AZURE.
  backupTarget: ~
+ # -- The name of the Kubernetes secret associated with the backup target.
  backupTargetCredentialSecret: ~
+ # -- If this setting is enabled, Longhorn will automatically attach the volume and take snapshot/backup
+ # when it is the time to do recurring snapshot/backup.
  allowRecurringJobWhileVolumeDetached: ~
+ # -- Create default Disk automatically only on Nodes with the label "node.longhorn.io/create-default-disk=true" if no other disks exist.
+ # If disabled, the default disk will be created on all new nodes when each node is first added.
  createDefaultDiskLabeledNodes: ~
+ # -- Default path to use for storing data on a host. By default "/var/lib/longhorn/"
  defaultDataPath: ~
+ # -- Longhorn volume has data locality if there is a local replica of the volume on the same node as the pod which is using the volume.
  defaultDataLocality: ~
+ # -- Allow scheduling on nodes with existing healthy replicas of the same volume. By default false.
  replicaSoftAntiAffinity: ~
+ # -- Enable this setting automatically rebalances replicas when discovered an available node.
  replicaAutoBalance: ~
+ # -- The over-provisioning percentage defines how much storage can be allocated relative to the hard drive's capacity. By default 200.
  storageOverProvisioningPercentage: ~
+ # -- If the minimum available disk capacity exceeds the actual percentage of available disk capacity,
+ # the disk becomes unschedulable until more space is freed up. By default 25.
  storageMinimalAvailablePercentage: ~
+ # -- The reserved percentage specifies the percentage of disk space that will not be allocated to the default disk on each new Longhorn node.
+ storageReservedPercentageForDefaultDisk: ~
+ # -- Upgrade Checker will check for new Longhorn version periodically.
+ # When there is a new version available, a notification will appear in the UI. By default true.
  upgradeChecker: ~
+ # -- The default number of replicas when a volume is created from the Longhorn UI.
+ # For Kubernetes configuration, update the `numberOfReplicas` in the StorageClass. By default 3.
  defaultReplicaCount: ~
+ # -- The 'storageClassName' is given to PVs and PVCs that are created for an existing Longhorn volume. The StorageClass name can also be used as a label,
+ # so it is possible to use a Longhorn StorageClass to bind a workload to an existing PV without creating a Kubernetes StorageClass object.
+ # By default 'longhorn-static'.
  defaultLonghornStaticStorageClass: ~
+ # -- In seconds. The backupstore poll interval determines how often Longhorn checks the backupstore for new backups.
+ # Set to 0 to disable the polling. By default 300.
  backupstorePollInterval: ~
+ # -- In minutes. This setting determines how long Longhorn will keep the backup resource that was failed. Set to 0 to disable the auto-deletion.
  failedBackupTTL: ~
# -- Restore recurring jobs from the backup volume on the backup target and create recurring jobs if not exist during a backup restoration.
restoreVolumeRecurringJobs: ~
# -- This setting specifies how many successful backup or snapshot job histories should be retained. History will not be retained if the value is 0.
recurringSuccessfulJobsHistoryLimit: ~
# -- This setting specifies how many failed backup or snapshot job histories should be retained. History will not be retained if the value is 0.
recurringFailedJobsHistoryLimit: ~
# -- This setting specifies how many failed support bundles can exist in the cluster.
# Set this value to **0** to have Longhorn automatically purge all failed support bundles.
supportBundleFailedHistoryLimit: ~
+ # -- taintToleration for longhorn system components
  taintToleration: ~
+ # -- nodeSelector for longhorn system components
  systemManagedComponentsNodeSelector: ~
+ # -- priorityClass for longhorn system components
  priorityClass: ~
+ # -- If enabled, volumes will be automatically salvaged when all the replicas become faulty e.g. due to network disconnection.
+ # Longhorn will try to figure out which replica(s) are usable, then use them for the volume. By default true.
  autoSalvage: ~
+ # -- If enabled, Longhorn will automatically delete the workload pod that is managed by a controller (e.g. deployment, statefulset, daemonset, etc...)
+ # when Longhorn volume is detached unexpectedly (e.g. during Kubernetes upgrade, Docker reboot, or network disconnect).
+ # By deleting the pod, its controller restarts the pod and Kubernetes handles volume reattachment and remount.
  autoDeletePodWhenVolumeDetachedUnexpectedly: ~
+ # -- Disable Longhorn manager to schedule replica on Kubernetes cordoned node. By default true.
  disableSchedulingOnCordonedNode: ~
+ # -- Allow scheduling new Replicas of Volume to the Nodes in the same Zone as existing healthy Replicas.
+ # Nodes don't belong to any Zone will be treated as in the same Zone.
+ # Notice that Longhorn relies on label `topology.kubernetes.io/zone=<Zone name of the node>` in the Kubernetes node object to identify the zone.
+ # By default true.
  replicaZoneSoftAntiAffinity: ~
+ # -- Allow scheduling on disks with existing healthy replicas of the same volume. By default true.
+ replicaDiskSoftAntiAffinity: ~
+ # -- Defines the Longhorn action when a Volume is stuck with a StatefulSet/Deployment Pod on a node that is down.
  nodeDownPodDeletionPolicy: ~
- allowNodeDrainWithLastHealthyReplica: ~
- nodeDrainPolicy : ~
- mkfsExt4Parameters: ~
- disableReplicaRebuild: ~
+ # -- Define the policy to use when a node with the last healthy replica of a volume is drained.
+ nodeDrainPolicy: ~
+ # -- In seconds. The interval determines how long Longhorn will wait at least in order to reuse the existing data on a failed replica
+ # rather than directly creating a new replica for a degraded volume.
  replicaReplenishmentWaitInterval: ~
+ # -- This setting controls how many replicas on a node can be rebuilt simultaneously.
  concurrentReplicaRebuildPerNodeLimit: ~
+ # -- This setting controls how many volumes on a node can restore the backup concurrently. Set the value to **0** to disable backup restore.
+ concurrentVolumeBackupRestorePerNodeLimit: ~
+ # -- This setting is only for volumes created by UI.
+ # By default, this is false meaning there will be a revision counter file to track every write to the volume.
+ # During salvage recovering Longhorn will pick the replica with largest revision counter as candidate to recover the whole volume.
+ # If revision counter is disabled, Longhorn will not track every write to the volume.
+ # During the salvage recovering, Longhorn will use the 'volume-head-xxx.img' file last modification time and
+ # file size to pick the replica candidate to recover the whole volume.
  disableRevisionCounter: ~
+ # -- This setting defines the Image Pull Policy of Longhorn system managed pod.
+ # e.g. instance manager, engine image, CSI driver, etc.
+ # The new Image Pull Policy will only apply after the system managed pods restart.
  systemManagedPodsImagePullPolicy: ~
+ # -- This setting allows user to create and attach a volume that doesn't have all the replicas scheduled at the time of creation.
  allowVolumeCreationWithDegradedAvailability: ~
+ # -- This setting enables Longhorn to automatically cleanup the system generated snapshot after replica rebuild is done.
  autoCleanupSystemGeneratedSnapshot: ~
+ # -- This setting controls how Longhorn automatically upgrades volumes' engines to the new default engine image after upgrading Longhorn manager.
+ # The value of this setting specifies the maximum number of engines per node that are allowed to upgrade to the default engine image at the same time.
+ # If the value is 0, Longhorn will not automatically upgrade volumes' engines to default version.
  concurrentAutomaticEngineUpgradePerNodeLimit: ~
+ # -- This interval in minutes determines how long Longhorn will wait before cleaning up the backing image file when there is no replica in the disk using it.
  backingImageCleanupWaitInterval: ~
+ # -- This interval in seconds determines how long Longhorn will wait before re-downloading the backing image file
+ # when all disk files of this backing image become failed or unknown.
  backingImageRecoveryWaitInterval: ~
- guaranteedEngineManagerCPU: ~
- guaranteedReplicaManagerCPU: ~
+ # -- This integer value indicates how many percentage of the total allocatable CPU on each node will be reserved for each instance manager Pod.
+ # You can leave it with the default value, which is 12%.
+ guaranteedInstanceManagerCPU: ~
+ # -- Enabling this setting will notify Longhorn that the cluster is using Kubernetes Cluster Autoscaler.
  kubernetesClusterAutoscalerEnabled: ~
+ # -- This setting allows Longhorn to delete the orphan resource and its corresponding orphaned data automatically like stale replicas.
+ # Orphan resources on down or unknown nodes will not be cleaned up automatically.
  orphanAutoDeletion: ~
+ # -- Longhorn uses the storage network for in-cluster data traffic. Leave this blank to use the Kubernetes cluster network.
  storageNetwork: ~
# -- This flag is designed to prevent Longhorn from being accidentally uninstalled which will lead to data lost.
deletingConfirmationFlag: ~
# -- In seconds. The setting specifies the timeout between the engine and replica(s), and the value should be between 8 to 30 seconds.
# The default value is 8 seconds.
engineReplicaTimeout: ~
# -- This setting allows users to enable or disable snapshot hashing and data integrity checking.
snapshotDataIntegrity: ~
# -- Hashing snapshot disk files impacts the performance of the system.
# The immediate snapshot hashing and checking can be disabled to minimize the impact after creating a snapshot.
snapshotDataIntegrityImmediateCheckAfterSnapshotCreation: ~
# -- Unix-cron string format. The setting specifies when Longhorn checks the data integrity of snapshot disk files.
snapshotDataIntegrityCronjob: ~
# -- This setting allows Longhorn filesystem trim feature to automatically mark the latest snapshot and
# its ancestors as removed and stops at the snapshot containing multiple children.
removeSnapshotsDuringFilesystemTrim: ~
# -- This feature supports the fast replica rebuilding.
# It relies on the checksum of snapshot disk files, so setting the snapshot-data-integrity to **enable** or **fast-check** is a prerequisite.
fastReplicaRebuildEnabled: ~
# -- In seconds. The setting specifies the HTTP client timeout to the file sync server.
replicaFileSyncHttpClientTimeout: ~
# -- The log level Panic, Fatal, Error, Warn, Info, Debug, Trace used in longhorn manager. Default to Info.
logLevel: ~
# -- This setting allows users to specify backup compression method.
backupCompressionMethod: ~
# -- This setting controls how many worker threads per backup concurrently.
backupConcurrentLimit: ~
# -- This setting controls how many worker threads per restore concurrently.
restoreConcurrentLimit: ~
# -- This allows users to activate v2 data engine based on SPDK.
# Currently, it is in the preview phase and should not be utilized in a production environment.
v2DataEngine: ~
# -- This setting allows users to enable the offline replica rebuilding for volumes using v2 data engine.
offlineReplicaRebuilding: ~
# -- Allow Scheduling Empty Node Selector Volumes To Any Node
allowEmptyNodeSelectorVolume: ~
# -- Allow Scheduling Empty Disk Selector Volumes To Any Disk
allowEmptyDiskSelectorVolume: ~
privateRegistry:
# -- Set `true` to create a new private registry secret
createSecret: ~
# -- URL of private registry. Leave blank to apply system default registry
registryUrl: ~
# -- User used to authenticate to private registry
registryUser: ~
# -- Password used to authenticate to private registry
registryPasswd: ~
# -- If create a new private registry secret is true, create a Kubernetes secret with this name; else use the existing secret of this name. Use it to pull images from your private registry
registrySecret: ~
longhornManager:
log:
# -- Options: `plain`, `json`
format: plain
# -- Priority class for longhorn manager
priorityClass: ~
# -- Tolerate nodes to run Longhorn manager
tolerations: []
## If you want to set tolerations for Longhorn Manager DaemonSet, delete the `[]` in the line above
## and uncomment this example block
@ -157,11 +354,13 @@ longhornManager:
# operator: "Equal"
# value: "value"
# effect: "NoSchedule"
# -- Select nodes to run Longhorn manager
nodeSelector: {}
## If you want to set node selector for Longhorn Manager DaemonSet, delete the `{}` in the line above
## and uncomment this example block
# label-key1: "label-value1"
# label-key2: "label-value2"
# -- Annotation used in Longhorn manager service
serviceAnnotations: {}
## If you want to set annotations for the Longhorn Manager service, delete the `{}` in the line above
## and uncomment this example block
@ -169,7 +368,9 @@ longhornManager:
# annotation-key2: "annotation-value2"
longhornDriver:
# -- Priority class for longhorn driver
priorityClass: ~
# -- Tolerate nodes to run Longhorn driver
tolerations: []
## If you want to set tolerations for Longhorn Driver Deployer Deployment, delete the `[]` in the line above
## and uncomment this example block
@ -177,6 +378,7 @@ longhornDriver:
# operator: "Equal"
# value: "value"
# effect: "NoSchedule"
# -- Select nodes to run Longhorn driver
nodeSelector: {}
## If you want to set node selector for Longhorn Driver Deployer Deployment, delete the `{}` in the line above
## and uncomment this example block
@ -184,8 +386,11 @@ longhornDriver:
# label-key2: "label-value2"
longhornUI:
# -- Replica count for longhorn ui
replicas: 2
# -- Priority class count for longhorn ui
priorityClass: ~
# -- Tolerate nodes to run Longhorn UI
tolerations: []
## If you want to set tolerations for Longhorn UI Deployment, delete the `[]` in the line above
## and uncomment this example block
@ -193,6 +398,7 @@ longhornUI:
# operator: "Equal"
# value: "value"
# effect: "NoSchedule"
# -- Select nodes to run Longhorn UI
nodeSelector: {}
## If you want to set node selector for Longhorn UI Deployment, delete the `{}` in the line above
## and uncomment this example block
@ -200,29 +406,29 @@ longhornUI:
# label-key2: "label-value2"
ingress:
# -- Set to true to enable ingress record generation
enabled: false
# -- Add ingressClassName to the Ingress
# Can replace the kubernetes.io/ingress.class annotation on v1.18+
ingressClassName: ~
# -- Layer 7 Load Balancer hostname
host: sslip.io
# -- Set this to true in order to enable TLS on the ingress record
tls: false
# -- Enable this in order to enable that the backend service will be connected at port 443
secureBackends: false
# -- If TLS is set to true, you must declare what secret will store the key/certificate for TLS
tlsSecret: longhorn.local-tls
# -- If ingress is enabled you can set the default ingress path
# then you can access the UI by using the following full path {{host}}+{{path}}
path: /
## If you're using kube-lego, you will want to add:
## kubernetes.io/tls-acme: true
##
@ -230,10 +436,12 @@ ingress:
## ref: https://github.com/kubernetes/ingress-nginx/blob/master/docs/annotations.md
##
## If tls is set to true, annotation ingress.kubernetes.io/secure-backends: "true" will automatically be set
# -- Ingress annotations done as key:value pairs
annotations:
# kubernetes.io/ingress.class: nginx
# kubernetes.io/tls-acme: true
# -- If you're providing your own certificates, please use this to add the certificates as secrets
secrets:
## If you're providing your own certificates, please use this to add the certificates as secrets
## key and certificate should start with -----BEGIN CERTIFICATE----- or
@ -248,16 +456,25 @@ ingress:
# key:
# certificate:
# -- For Kubernetes < v1.25, if your cluster enables Pod Security Policy admission controller,
# set this to `true` to ship longhorn-psp which allow privileged Longhorn pods to start
enablePSP: false
# -- Annotations to add to the Longhorn Manager DaemonSet Pods. Optional.
annotations: {}
serviceAccount:
# -- Annotations to add to the service account
annotations: {}
## openshift settings
openshift:
# -- Enable when using openshift
enabled: false
ui:
# -- UI route in openshift environment
route: "longhorn-ui"
# -- UI port in openshift environment
port: 443
# -- UI proxy in openshift environment
proxy: 8443
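For reference, a minimal sketch of enabling these OpenShift settings at install time; the chart path `./chart`, release name, and namespace below are illustrative placeholders, not values fixed by this file:

```bash
# Hedged example: enable the OpenShift route for the Longhorn UI via Helm values.
# "./chart" is a placeholder for wherever this chart lives in your checkout.
helm upgrade --install longhorn ./chart \
  --namespace longhorn-system --create-namespace \
  --set openshift.enabled=true \
  --set openshift.ui.route=longhorn-ui \
  --set openshift.ui.port=443 \
  --set openshift.ui.proxy=8443
```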

View File

@ -0,0 +1,48 @@
# same secret for longhorn-system namespace
apiVersion: v1
kind: Secret
metadata:
name: azblob-secret
namespace: longhorn-system
type: Opaque
data:
AZBLOB_ACCOUNT_NAME: ZGV2c3RvcmVhY2NvdW50MQ==
AZBLOB_ACCOUNT_KEY: RWJ5OHZkTTAyeE5PY3FGbHFVd0pQTGxtRXRsQ0RYSjFPVXpGVDUwdVNSWjZJRnN1RnEyVVZFckN6NEk2dHEvSzFTWkZQVE90ci9LQkhCZWtzb0dNR3c9PQ==
AZBLOB_ENDPOINT: aHR0cDovL2F6YmxvYi1zZXJ2aWNlLmRlZmF1bHQ6MTAwMDAv
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: longhorn-test-azblob
namespace: default
labels:
app: longhorn-test-azblob
spec:
replicas: 1
selector:
matchLabels:
app: longhorn-test-azblob
template:
metadata:
labels:
app: longhorn-test-azblob
spec:
containers:
- name: azurite
image: mcr.microsoft.com/azure-storage/azurite:3.23.0
ports:
- containerPort: 10000
---
apiVersion: v1
kind: Service
metadata:
name: azblob-service
namespace: default
spec:
selector:
app: longhorn-test-azblob
ports:
- port: 10000
targetPort: 10000
protocol: TCP
sessionAffinity: ClientIP

View File

@ -0,0 +1,87 @@
apiVersion: v1
kind: Secret
metadata:
name: cifs-secret
namespace: longhorn-system
type: Opaque
data:
CIFS_USERNAME: bG9uZ2hvcm4tY2lmcy11c2VybmFtZQ== # longhorn-cifs-username
CIFS_PASSWORD: bG9uZ2hvcm4tY2lmcy1wYXNzd29yZA== # longhorn-cifs-password
---
apiVersion: v1
kind: Secret
metadata:
name: cifs-secret
namespace: default
type: Opaque
data:
CIFS_USERNAME: bG9uZ2hvcm4tY2lmcy11c2VybmFtZQ== # longhorn-cifs-username
CIFS_PASSWORD: bG9uZ2hvcm4tY2lmcy1wYXNzd29yZA== # longhorn-cifs-password
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: longhorn-test-cifs
namespace: default
labels:
app: longhorn-test-cifs
spec:
replicas: 1
selector:
matchLabels:
app: longhorn-test-cifs
template:
metadata:
labels:
app: longhorn-test-cifs
spec:
volumes:
- name: cifs-volume
emptyDir: {}
containers:
- name: longhorn-test-cifs-container
image: derekbit/samba:latest
ports:
- containerPort: 139
- containerPort: 445
imagePullPolicy: Always
env:
- name: EXPORT_PATH
value: /opt/backupstore
- name: CIFS_DISK_IMAGE_SIZE_MB
value: "4096"
- name: CIFS_USERNAME
valueFrom:
secretKeyRef:
name: cifs-secret
key: CIFS_USERNAME
- name: CIFS_PASSWORD
valueFrom:
secretKeyRef:
name: cifs-secret
key: CIFS_PASSWORD
securityContext:
privileged: true
capabilities:
add: ["SYS_ADMIN", "DAC_READ_SEARCH"]
volumeMounts:
- name: cifs-volume
mountPath: "/opt/backupstore"
args: ["-u", "$(CIFS_USERNAME);$(CIFS_PASSWORD)", "-s", "backupstore;$(EXPORT_PATH);yes;no;no;all;none"]
---
kind: Service
apiVersion: v1
metadata:
name: longhorn-test-cifs-svc
namespace: default
spec:
selector:
app: longhorn-test-cifs
clusterIP: None
ports:
- name: netbios-port
port: 139
targetPort: 139
- name: microsoft-port
port: 445
targetPort: 445
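To use this SMB share as a backupstore, Longhorn's backup target has to point at it. A hedged sketch using the chart's default settings follows; the `cifs://` URL format, the `defaultSettings.*` keys, and the file/chart paths are assumptions to verify against the Longhorn backup target documentation for your release:

```bash
# Deploy the CIFS test backupstore above, then point Longhorn at it.
# File and chart paths are placeholders.
kubectl apply -f cifs-backupstore.yaml   # hypothetical file name for the manifest above
helm upgrade longhorn ./chart --namespace longhorn-system --reuse-values \
  --set defaultSettings.backupTarget="cifs://longhorn-test-cifs-svc.default/backupstore" \
  --set defaultSettings.backupTargetCredentialSecret="cifs-secret"
```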

View File

@ -24,49 +24,57 @@ data:
AWS_ENDPOINTS: aHR0cHM6Ly9taW5pby1zZXJ2aWNlLmRlZmF1bHQ6OTAwMA== # https://minio-service.default:9000
AWS_CERT: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURMRENDQWhTZ0F3SUJBZ0lSQU1kbzQycGhUZXlrMTcvYkxyWjVZRHN3RFFZSktvWklodmNOQVFFTEJRQXcKR2pFWU1CWUdBMVVFQ2hNUFRHOXVaMmh2Y200Z0xTQlVaWE4wTUNBWERUSXdNRFF5TnpJek1EQXhNVm9ZRHpJeApNakF3TkRBek1qTXdNREV4V2pBYU1SZ3dGZ1lEVlFRS0V3OU1iMjVuYUc5eWJpQXRJRlJsYzNRd2dnRWlNQTBHCkNTcUdTSWIzRFFFQkFRVUFBNElCRHdBd2dnRUtBb0lCQVFEWHpVdXJnUFpEZ3pUM0RZdWFlYmdld3Fvd2RlQUQKODRWWWF6ZlN1USs3K21Oa2lpUVBvelVVMmZvUWFGL1BxekJiUW1lZ29hT3l5NVhqM1VFeG1GcmV0eDBaRjVOVgpKTi85ZWFJNWRXRk9teHhpMElPUGI2T0RpbE1qcXVEbUVPSXljdjRTaCsvSWo5Zk1nS0tXUDdJZGxDNUJPeThkCncwOVdkckxxaE9WY3BKamNxYjN6K3hISHd5Q05YeGhoRm9tb2xQVnpJbnlUUEJTZkRuSDBuS0lHUXl2bGhCMGsKVHBHSzYxc2prZnFTK3hpNTlJeHVrbHZIRXNQcjFXblRzYU9oaVh6N3lQSlorcTNBMWZoVzBVa1JaRFlnWnNFbQovZ05KM3JwOFhZdURna2kzZ0UrOElXQWRBWHExeWhqRDdSSkI4VFNJYTV0SGpKUUtqZ0NlSG5HekFnTUJBQUdqCmF6QnBNQTRHQTFVZER3RUIvd1FFQXdJQ3BEQVRCZ05WSFNVRUREQUtCZ2dyQmdFRkJRY0RBVEFQQmdOVkhSTUIKQWY4RUJUQURBUUgvTURFR0ExVWRFUVFxTUNpQ0NXeHZZMkZzYUc5emRJSVZiV2x1YVc4dGMyVnlkbWxqWlM1awpaV1poZFd4MGh3Ui9BQUFCTUEwR0NTcUdTSWIzRFFFQkN3VUFBNElCQVFDbUZMMzlNSHVZMzFhMTFEajRwMjVjCnFQRUM0RHZJUWozTk9kU0dWMmQrZjZzZ3pGejFXTDhWcnF2QjFCMVM2cjRKYjJQRXVJQkQ4NFlwVXJIT1JNU2MKd3ViTEppSEtEa0Jmb2U5QWI1cC9VakpyS0tuajM0RGx2c1cvR3AwWTZYc1BWaVdpVWorb1JLbUdWSTI0Q0JIdgpnK0JtVzNDeU5RR1RLajk0eE02czNBV2xHRW95YXFXUGU1eHllVWUzZjFBWkY5N3RDaklKUmVWbENtaENGK0JtCmFUY1RSUWN3cVdvQ3AwYmJZcHlERFlwUmxxOEdQbElFOW8yWjZBc05mTHJVcGFtZ3FYMmtYa2gxa3lzSlEralAKelFadHJSMG1tdHVyM0RuRW0yYmk0TktIQVFIcFc5TXUxNkdRakUxTmJYcVF0VEI4OGpLNzZjdEg5MzRDYWw2VgotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0t
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: longhorn-test-minio
namespace: default
labels:
app: longhorn-test-minio
spec:
replicas: 1
selector:
matchLabels:
app: longhorn-test-minio
template:
metadata:
labels:
app: longhorn-test-minio
spec:
volumes:
- name: minio-volume
emptyDir: {}
- name: minio-certificates
secret:
secretName: minio-secret
items:
- key: AWS_CERT
path: public.crt
- key: AWS_CERT_KEY
path: private.key
containers:
- name: minio
image: minio/minio:RELEASE.2022-02-01T18-00-14Z
command: ["sh", "-c", "mkdir -p /storage/backupbucket && mkdir -p /root/.minio/certs && ln -s /root/certs/private.key /root/.minio/certs/private.key && ln -s /root/certs/public.crt /root/.minio/certs/public.crt && exec minio server /storage"]
env:
- name: MINIO_ROOT_USER
valueFrom:
secretKeyRef:
name: minio-secret
key: AWS_ACCESS_KEY_ID
- name: MINIO_ROOT_PASSWORD
valueFrom:
secretKeyRef:
name: minio-secret
key: AWS_SECRET_ACCESS_KEY
ports:
- containerPort: 9000
volumeMounts:
- name: minio-volume
mountPath: "/storage"
- name: minio-certificates
mountPath: "/root/certs"
readOnly: true
---
apiVersion: v1
kind: Service

View File

@ -1,41 +1,49 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: longhorn-test-nfs
namespace: default
labels:
app: longhorn-test-nfs
spec:
selector:
matchLabels:
app: longhorn-test-nfs
template:
metadata:
labels:
app: longhorn-test-nfs
spec:
volumes:
- name: nfs-volume
emptyDir: {}
containers:
- name: longhorn-test-nfs-container
image: longhornio/nfs-ganesha:latest
imagePullPolicy: Always
env:
- name: EXPORT_ID
value: "14"
- name: EXPORT_PATH
value: /opt/backupstore
- name: PSEUDO_PATH
value: /opt/backupstore
- name: NFS_DISK_IMAGE_SIZE_MB
value: "4096"
command: ["bash", "-c", "chmod 700 /opt/backupstore && /opt/start_nfs.sh | tee /var/log/ganesha.log"]
securityContext:
privileged: true
capabilities:
add: ["SYS_ADMIN", "DAC_READ_SEARCH"]
volumeMounts:
- name: nfs-volume
mountPath: "/opt/backupstore"
livenessProbe:
exec:
command: ["bash", "-c", "grep \"No export entries found\" /var/log/ganesha.log > /dev/null 2>&1 ; [ $? -ne 0 ]"]
initialDelaySeconds: 5
periodSeconds: 5
timeoutSeconds: 4
---
kind: Service
apiVersion: v1

View File

@ -1,12 +1,13 @@
longhornio/csi-attacher:v4.2.0
longhornio/csi-provisioner:v3.4.1
longhornio/csi-resizer:v1.7.0
longhornio/csi-snapshotter:v6.2.1
longhornio/csi-node-driver-registrar:v2.7.0
longhornio/livenessprobe:v2.9.0
longhornio/backing-image-manager:master-head
longhornio/longhorn-engine:master-head
longhornio/longhorn-instance-manager:master-head
longhornio/longhorn-manager:master-head
longhornio/longhorn-share-manager:master-head
longhornio/longhorn-ui:master-head
longhornio/support-bundle-kit:v0.0.27
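For air-gapped installs these images are typically mirrored into a private registry and then referenced through the `privateRegistry` values shown earlier. A hedged sketch follows; the registry URL and the list file name are placeholders:

```bash
# Mirror the image list above into a private registry.
REGISTRY=registry.example.com            # placeholder registry URL
while read -r image; do
  docker pull "${image}"
  docker tag "${image}" "${REGISTRY}/${image}"
  docker push "${REGISTRY}/${image}"
done < longhorn-images.txt               # placeholder name for the list above
```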

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,61 @@
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
name: longhorn-psp
spec:
privileged: true
allowPrivilegeEscalation: true
requiredDropCapabilities:
- NET_RAW
allowedCapabilities:
- SYS_ADMIN
hostNetwork: false
hostIPC: false
hostPID: true
runAsUser:
rule: RunAsAny
seLinux:
rule: RunAsAny
fsGroup:
rule: RunAsAny
supplementalGroups:
rule: RunAsAny
volumes:
- configMap
- downwardAPI
- emptyDir
- secret
- projected
- hostPath
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: longhorn-psp-role
namespace: longhorn-system
rules:
- apiGroups:
- policy
resources:
- podsecuritypolicies
verbs:
- use
resourceNames:
- longhorn-psp
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: longhorn-psp-binding
namespace: longhorn-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: longhorn-psp-role
subjects:
- kind: ServiceAccount
name: longhorn-service-account
namespace: longhorn-system
- kind: ServiceAccount
name: default
namespace: longhorn-system

View File

@ -0,0 +1,36 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: longhorn-cifs-installation
labels:
app: longhorn-cifs-installation
annotations:
command: &cmd OS=$(grep -E "^ID_LIKE=" /etc/os-release | cut -d '=' -f 2); if [[ -z "${OS}" ]]; then OS=$(grep -E "^ID=" /etc/os-release | cut -d '=' -f 2); fi; if [[ "${OS}" == *"debian"* ]]; then sudo apt-get update -q -y && sudo apt-get install -q -y cifs-utils; elif [[ "${OS}" == *"suse"* ]]; then sudo zypper --gpg-auto-import-keys -q refresh && sudo zypper --gpg-auto-import-keys -q install -y cifs-utils; else sudo yum makecache -q -y && sudo yum --setopt=tsflags=noscripts install -q -y cifs-utils; fi && if [ $? -eq 0 ]; then echo "cifs install successfully"; else echo "cifs utilities install failed error code $?"; fi
spec:
selector:
matchLabels:
app: longhorn-cifs-installation
template:
metadata:
labels:
app: longhorn-cifs-installation
spec:
hostNetwork: true
hostPID: true
initContainers:
- name: cifs-installation
command:
- nsenter
- --mount=/proc/1/ns/mnt
- --
- bash
- -c
- *cmd
image: alpine:3.12
securityContext:
privileged: true
containers:
- name: sleep
image: registry.k8s.io/pause:3.1
updateStrategy:
type: RollingUpdate

View File

@ -0,0 +1,35 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: longhorn-iscsi-selinux-workaround
labels:
app: longhorn-iscsi-selinux-workaround
annotations:
command: &cmd if ! rpm -q policycoreutils > /dev/null 2>&1; then echo "failed to apply workaround; only applicable in Fedora based distros with SELinux enabled"; exit; elif cd /tmp && echo '(allow iscsid_t self (capability (dac_override)))' > local_longhorn.cil && semodule -vi local_longhorn.cil && rm -f local_longhorn.cil; then echo "applied workaround successfully"; else echo "failed to apply workaround; error code $?"; fi
spec:
selector:
matchLabels:
app: longhorn-iscsi-selinux-workaround
template:
metadata:
labels:
app: longhorn-iscsi-selinux-workaround
spec:
hostPID: true
initContainers:
- name: iscsi-selinux-workaround
command:
- nsenter
- --mount=/proc/1/ns/mnt
- --
- bash
- -c
- *cmd
image: alpine:3.17
securityContext:
privileged: true
containers:
- name: sleep
image: registry.k8s.io/pause:3.1
updateStrategy:
type: RollingUpdate

View File

@ -0,0 +1,36 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: longhorn-nvme-cli-installation
labels:
app: longhorn-nvme-cli-installation
annotations:
command: &cmd OS=$(grep -E "^ID_LIKE=" /etc/os-release | cut -d '=' -f 2); if [[ -z "${OS}" ]]; then OS=$(grep -E "^ID=" /etc/os-release | cut -d '=' -f 2); fi; if [[ "${OS}" == *"debian"* ]]; then sudo apt-get update -q -y && sudo apt-get install -q -y nvme-cli && sudo modprobe nvme-tcp; elif [[ "${OS}" == *"suse"* ]]; then sudo zypper --gpg-auto-import-keys -q refresh && sudo zypper --gpg-auto-import-keys -q install -y nvme-cli && sudo modprobe nvme-tcp; else sudo yum makecache -q -y && sudo yum --setopt=tsflags=noscripts install -q -y nvme-cli && sudo modprobe nvme-tcp; fi && if [ $? -eq 0 ]; then echo "nvme-cli install successfully"; else echo "nvme-cli install failed error code $?"; fi
spec:
selector:
matchLabels:
app: longhorn-nvme-cli-installation
template:
metadata:
labels:
app: longhorn-nvme-cli-installation
spec:
hostNetwork: true
hostPID: true
initContainers:
- name: nvme-cli-installation
command:
- nsenter
- --mount=/proc/1/ns/mnt
- --
- bash
- -c
- *cmd
image: alpine:3.12
securityContext:
privileged: true
containers:
- name: sleep
image: registry.k8s.io/pause:3.1
updateStrategy:
type: RollingUpdate

View File

@ -0,0 +1,47 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: longhorn-spdk-setup
labels:
app: longhorn-spdk-setup
annotations:
command: &cmd OS=$(grep -E "^ID_LIKE=" /etc/os-release | cut -d '=' -f 2); if [[ -z "${OS}" ]]; then OS=$(grep -E "^ID=" /etc/os-release | cut -d '=' -f 2); fi; if [[ "${OS}" == *"debian"* ]]; then sudo apt-get update -q -y && sudo apt-get install -q -y git; elif [[ "${OS}" == *"suse"* ]]; then sudo zypper --gpg-auto-import-keys -q refresh && sudo zypper --gpg-auto-import-keys -q install -y git; else sudo yum makecache -q -y && sudo yum --setopt=tsflags=noscripts install -q -y git; fi && if [ $? -eq 0 ]; then echo "git install successfully"; else echo "git install failed error code $?"; fi && rm -rf ${SPDK_DIR}; git clone -b longhorn https://github.com/longhorn/spdk.git ${SPDK_DIR} && bash ${SPDK_DIR}/scripts/setup.sh ${SPDK_OPTION}; if [ $? -eq 0 ]; then echo "vm.nr_hugepages=$((HUGEMEM/2))" >> /etc/sysctl.conf; echo "SPDK environment is configured successfully"; else echo "Failed to configure SPDK environment error code $?"; fi; rm -rf ${SPDK_DIR}
spec:
selector:
matchLabels:
app: longhorn-spdk-setup
template:
metadata:
labels:
app: longhorn-spdk-setup
spec:
hostNetwork: true
hostPID: true
initContainers:
- name: longhorn-spdk-setup
command:
- nsenter
- --mount=/proc/1/ns/mnt
- --
- bash
- -c
- *cmd
image: alpine:3.12
env:
- name: SPDK_DIR
value: "/tmp/spdk"
- name: SPDK_OPTION
value: ""
- name: HUGEMEM
value: "1024"
- name: PCI_ALLOWED
value: "none"
- name: DRIVER_OVERRIDE
value: "uio_pci_generic"
securityContext:
privileged: true
containers:
- name: sleep
image: registry.k8s.io/pause:3.1
updateStrategy:
type: RollingUpdate
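Together with the nvme-cli DaemonSet above, this prepares nodes for the v2 data engine. A hedged sketch of the overall flow, where the manifest paths are illustrative and `defaultSettings.v2DataEngine` maps to the preview `v2DataEngine` value shown in the chart values earlier:

```bash
# Prepare nodes (file paths are placeholders for wherever these DaemonSets live),
# then enable the preview v2 data engine via Helm values.
kubectl apply -f longhorn-nvme-cli-installation.yaml
kubectl apply -f longhorn-spdk-setup.yaml
helm upgrade longhorn ./chart --namespace longhorn-system --reuse-values \
  --set defaultSettings.v2DataEngine=true
```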

View File

@ -0,0 +1,7 @@
# Upgrade Responder Helm Chart
This directory contains the helm values for the Longhorn upgrade responder server.
The values are in the file `./chart-values.yaml`.
When you update the content of `./chart-values.yaml`, the automation pipeline updates the Longhorn upgrade responder accordingly.
Information about the source chart is in `chart.yaml`.
See [dev/upgrade-responder](../../dev/upgrade-responder/README.md) for manual deployment steps.

View File

@ -0,0 +1,372 @@
# Specify the name of the application that is using this Upgrade Responder server
# This will be used to create a database named <application-name>_upgrade_responder
# in the InfluxDB to store all data for this Upgrade Responder
# The name must be in snake case format
applicationName: longhorn
image:
repository: longhornio/upgrade-responder
tag: longhorn-head
pullPolicy: Always
secret:
name: upgrade-responder-secret
# Set this to false if you don't want to manage these secrets with helm
managed: false
resources:
limits:
cpu: 400m
memory: 512Mi
requests:
cpu: 200m
memory: 256Mi
# This configmap contains information about the latest release
# of the application that is using this Upgrade Responder
configMap:
responseConfig: |-
{
"versions": [
{
"name": "v1.3.3",
"releaseDate": "2023-04-19T00:00:00Z",
"tags": [
"stable"
]
},
{
"name": "v1.4.3",
"releaseDate": "2023-07-14T00:00:00Z",
"tags": [
"latest",
"stable"
]
},
{
"name": "v1.5.1",
"releaseDate": "2023-07-19T00:00:00Z",
"tags": [
"latest"
]
}
]
}
requestSchema: |-
{
"appVersionSchema": {
"dataType": "string",
"maxLen": 200
},
"extraTagInfoSchema": {
"hostKernelRelease": {
"dataType": "string",
"maxLen": 200
},
"hostOsDistro": {
"dataType": "string",
"maxLen": 200
},
"kubernetesNodeProvider": {
"dataType": "string",
"maxLen": 200
},
"kubernetesVersion": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingAllowRecurringJobWhileVolumeDetached": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingAllowVolumeCreationWithDegradedAvailability": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingAutoCleanupSystemGeneratedSnapshot": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingAutoDeletePodWhenVolumeDetachedUnexpectedly": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingAutoSalvage": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingBackupCompressionMethod": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingBackupTarget": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingCrdApiVersion": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingCreateDefaultDiskLabeledNodes": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingDefaultDataLocality": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingDisableRevisionCounter": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingDisableSchedulingOnCordonedNode": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingFastReplicaRebuildEnabled": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingKubernetesClusterAutoscalerEnabled": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingNodeDownPodDeletionPolicy": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingNodeDrainPolicy": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingOfflineReplicaRebuilding": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingOrphanAutoDeletion": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingPriorityClass": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingRegistrySecret": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingRemoveSnapshotsDuringFilesystemTrim": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingReplicaAutoBalance": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingReplicaSoftAntiAffinity": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingReplicaZoneSoftAntiAffinity": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingReplicaDiskSoftAntiAffinity": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingRestoreVolumeRecurringJobs": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingSnapshotDataIntegrity": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingSnapshotDataIntegrityCronjob": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingSnapshotDataIntegrityImmediateCheckAfterSnapshotCreation": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingStorageNetwork": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingSystemManagedComponentsNodeSelector": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingSystemManagedPodsImagePullPolicy": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingTaintToleration": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingV2DataEngine": {
"dataType": "string",
"maxLen": 200
}
},
"extraFieldInfoSchema": {
"longhornInstanceManagerAverageCpuUsageMilliCores": {
"dataType": "float"
},
"longhornInstanceManagerAverageMemoryUsageBytes": {
"dataType": "float"
},
"longhornManagerAverageCpuUsageMilliCores": {
"dataType": "float"
},
"longhornManagerAverageMemoryUsageBytes": {
"dataType": "float"
},
"longhornNamespaceUid": {
"dataType": "string",
"maxLen": 200
},
"longhornNodeCount": {
"dataType": "float"
},
"longhornNodeDiskHDDCount": {
"dataType": "float"
},
"longhornNodeDiskNVMeCount": {
"dataType": "float"
},
"longhornNodeDiskSSDCount": {
"dataType": "float"
},
"longhornSettingBackingImageCleanupWaitInterval": {
"dataType": "float"
},
"longhornSettingBackingImageRecoveryWaitInterval": {
"dataType": "float"
},
"longhornSettingBackupConcurrentLimit": {
"dataType": "float"
},
"longhornSettingBackupstorePollInterval": {
"dataType": "float"
},
"longhornSettingConcurrentAutomaticEngineUpgradePerNodeLimit": {
"dataType": "float"
},
"longhornSettingConcurrentReplicaRebuildPerNodeLimit": {
"dataType": "float"
},
"longhornSettingConcurrentVolumeBackupRestorePerNodeLimit": {
"dataType": "float"
},
"longhornSettingDefaultReplicaCount": {
"dataType": "float"
},
"longhornSettingEngineReplicaTimeout": {
"dataType": "float"
},
"longhornSettingFailedBackupTtl": {
"dataType": "float"
},
"longhornSettingGuaranteedInstanceManagerCpu": {
"dataType": "float"
},
"longhornSettingRecurringFailedJobsHistoryLimit": {
"dataType": "float"
},
"longhornSettingRecurringSuccessfulJobsHistoryLimit": {
"dataType": "float"
},
"longhornSettingReplicaFileSyncHttpClientTimeout": {
"dataType": "float"
},
"longhornSettingReplicaReplenishmentWaitInterval": {
"dataType": "float"
},
"longhornSettingRestoreConcurrentLimit": {
"dataType": "float"
},
"longhornSettingStorageMinimalAvailablePercentage": {
"dataType": "float"
},
"longhornSettingStorageOverProvisioningPercentage": {
"dataType": "float"
},
"longhornSettingStorageReservedPercentageForDefaultDisk": {
"dataType": "float"
},
"longhornSettingSupportBundleFailedHistoryLimit": {
"dataType": "float"
},
"longhornVolumeAccessModeRwoCount": {
"dataType": "float"
},
"longhornVolumeAccessModeRwxCount": {
"dataType": "float"
},
"longhornVolumeAccessModeUnknownCount": {
"dataType": "float"
},
"longhornVolumeAverageActualSizeBytes": {
"dataType": "float"
},
"longhornVolumeAverageNumberOfReplicas": {
"dataType": "float"
},
"longhornVolumeAverageSizeBytes": {
"dataType": "float"
},
"longhornVolumeAverageSnapshotCount": {
"dataType": "float"
},
"longhornVolumeDataLocalityBestEffortCount": {
"dataType": "float"
},
"longhornVolumeDataLocalityDisabledCount": {
"dataType": "float"
},
"longhornVolumeDataLocalityStrictLocalCount": {
"dataType": "float"
},
"longhornVolumeFrontendBlockdevCount": {
"dataType": "float"
},
"longhornVolumeFrontendIscsiCount": {
"dataType": "float"
},
"longhornVolumeOfflineReplicaRebuildingDisabledCount": {
"dataType": "float"
},
"longhornVolumeOfflineReplicaRebuildingEnabledCount": {
"dataType": "float"
},
"longhornVolumeReplicaAutoBalanceDisabledCount": {
"dataType": "float"
},
"longhornVolumeReplicaSoftAntiAffinityFalseCount": {
"dataType": "float"
},
"longhornVolumeReplicaZoneSoftAntiAffinityTrueCount": {
"dataType": "float"
},
"longhornVolumeReplicaDiskSoftAntiAffinityTrueCount": {
"dataType": "float"
},
"longhornVolumeRestoreVolumeRecurringJobFalseCount": {
"dataType": "float"
},
"longhornVolumeSnapshotDataIntegrityDisabledCount": {
"dataType": "float"
},
"longhornVolumeSnapshotDataIntegrityFastCheckCount": {
"dataType": "float"
},
"longhornVolumeUnmapMarkSnapChainRemovedFalseCount": {
"dataType": "float"
}
}
}

View File

@ -0,0 +1,5 @@
url: https://github.com/longhorn/upgrade-responder.git
commit: 116f807836c29185038cfb005708f0a8d41f4d35
releaseName: longhorn-upgrade-responder
namespace: longhorn-upgrade-responder

View File

@ -0,0 +1,55 @@
## Overview
### Install
1. Install Longhorn.
1. Install Longhorn [upgrade-responder](https://github.com/longhorn/upgrade-responder) stack.
```bash
./install.sh
```
Sample output:
```shell
secret/influxdb-creds created
persistentvolumeclaim/influxdb created
deployment.apps/influxdb created
service/influxdb created
Deployment influxdb is running.
Cloning into 'upgrade-responder'...
remote: Enumerating objects: 1077, done.
remote: Counting objects: 100% (1076/1076), done.
remote: Compressing objects: 100% (454/454), done.
remote: Total 1077 (delta 573), reused 1049 (delta 565), pack-reused 1
Receiving objects: 100% (1077/1077), 55.01 MiB | 18.10 MiB/s, done.
Resolving deltas: 100% (573/573), done.
Release "longhorn-upgrade-responder" does not exist. Installing it now.
NAME: longhorn-upgrade-responder
LAST DEPLOYED: Thu May 11 00:42:44 2023
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
1. Get the Upgrade Responder server URL by running these commands:
export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=upgrade-responder,app.kubernetes.io/instance=longhorn-upgrade-responder" -o jsonpath="{.items[0].metadata.name}")
kubectl port-forward $POD_NAME 8080:8314 --namespace default
echo "Upgrade Responder server URL is http://127.0.0.1:8080"
Deployment longhorn-upgrade-responder is running.
persistentvolumeclaim/grafana-pvc created
deployment.apps/grafana created
service/grafana created
Deployment grafana is running.
[Upgrade Checker]
URL : http://longhorn-upgrade-responder.default.svc.cluster.local:8314/v1/checkupgrade
[InfluxDB]
URL : http://influxdb.default.svc.cluster.local:8086
Database : longhorn_upgrade_responder
Username : root
Password : root
[Grafana]
Dashboard : http://1.2.3.4:30864
Username : admin
Password : admin
```
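Once the port-forward from the NOTES above is running, the endpoint can be smoke-tested. The request body below is an assumption derived from the `requestSchema` in `chart-values.yaml` (appVersion/extraTagInfo/extraFieldInfo); it is not an authoritative API contract:

```bash
# Hedged smoke test of the upgrade responder endpoint.
curl -s -X POST http://127.0.0.1:8080/v1/checkupgrade \
  -d '{"appVersion": "v1.4.3", "extraTagInfo": {"kubernetesVersion": "v1.27.6"}, "extraFieldInfo": {"longhornNodeCount": 3}}'
```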

424
dev/upgrade-responder/install.sh Executable file
View File

@ -0,0 +1,424 @@
#!/bin/bash
UPGRADE_RESPONDER_REPO="https://github.com/longhorn/upgrade-responder.git"
UPGRADE_RESPONDER_REPO_BRANCH="master"
UPGRADE_RESPONDER_VALUE_YAML="upgrade-responder-value.yaml"
UPGRADE_RESPONDER_IMAGE_REPO="longhornio/upgrade-responder"
UPGRADE_RESPONDER_IMAGE_TAG="master-head"
INFLUXDB_URL="http://influxdb.default.svc.cluster.local:8086"
APP_NAME="longhorn"
DEPLOYMENT_TIMEOUT_SEC=300
DEPLOYMENT_WAIT_INTERVAL_SEC=5
temp_dir=$(mktemp -d)
trap 'rm -rf "${temp_dir}"' EXIT # -f because packed Git files (.pack, .idx) are write protected.
cp -a ./* ${temp_dir}
cd ${temp_dir}
wait_for_deployment() {
local deployment_name="$1"
local start_time=$(date +%s)
while true; do
status=$(kubectl rollout status deployment/${deployment_name})
if [[ ${status} == *"successfully rolled out"* ]]; then
echo "Deployment ${deployment_name} is running."
break
fi
elapsed_secs=$(($(date +%s) - ${start_time}))
if [[ ${elapsed_secs} -ge ${DEPLOYMENT_TIMEOUT_SEC} ]]; then
echo "Timed out waiting for deployment ${deployment_name} to be running."
exit 1
fi
echo "Deployment ${deployment_name} is not running yet. Waiting..."
sleep ${DEPLOYMENT_WAIT_INTERVAL_SEC}
done
}
install_influxdb() {
kubectl apply -f ./manifests/influxdb.yaml
wait_for_deployment "influxdb"
}
install_grafana() {
kubectl apply -f ./manifests/grafana.yaml
wait_for_deployment "grafana"
}
install_upgrade_responder() {
cat << EOF > ${UPGRADE_RESPONDER_VALUE_YAML}
applicationName: ${APP_NAME}
secret:
name: upgrade-responder-secrets
managed: true
influxDBUrl: "${INFLUXDB_URL}"
influxDBUser: "root"
influxDBPassword: "root"
configMap:
responseConfig: |-
{
"versions": [{
"name": "v1.0.0",
"releaseDate": "2020-05-18T12:30:00Z",
"tags": ["latest"]
}]
}
requestSchema: |-
{
"appVersionSchema": {
"dataType": "string",
"maxLen": 200
},
"extraTagInfoSchema": {
"hostKernelRelease": {
"dataType": "string",
"maxLen": 200
},
"hostOsDistro": {
"dataType": "string",
"maxLen": 200
},
"kubernetesNodeProvider": {
"dataType": "string",
"maxLen": 200
},
"kubernetesVersion": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingAllowRecurringJobWhileVolumeDetached": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingAllowVolumeCreationWithDegradedAvailability": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingAutoCleanupSystemGeneratedSnapshot": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingAutoDeletePodWhenVolumeDetachedUnexpectedly": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingAutoSalvage": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingBackupCompressionMethod": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingBackupTarget": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingCrdApiVersion": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingCreateDefaultDiskLabeledNodes": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingDefaultDataLocality": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingDisableRevisionCounter": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingDisableSchedulingOnCordonedNode": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingFastReplicaRebuildEnabled": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingKubernetesClusterAutoscalerEnabled": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingNodeDownPodDeletionPolicy": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingNodeDrainPolicy": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingOfflineReplicaRebuilding": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingOrphanAutoDeletion": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingPriorityClass": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingRegistrySecret": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingRemoveSnapshotsDuringFilesystemTrim": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingReplicaAutoBalance": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingReplicaSoftAntiAffinity": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingReplicaZoneSoftAntiAffinity": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingReplicaDiskSoftAntiAffinity": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingRestoreVolumeRecurringJobs": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingSnapshotDataIntegrity": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingSnapshotDataIntegrityCronjob": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingSnapshotDataIntegrityImmediateCheckAfterSnapshotCreation": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingStorageNetwork": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingSystemManagedComponentsNodeSelector": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingSystemManagedPodsImagePullPolicy": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingTaintToleration": {
"dataType": "string",
"maxLen": 200
},
"longhornSettingV2DataEngine": {
"dataType": "string",
"maxLen": 200
}
},
"extraFieldInfoSchema": {
"longhornInstanceManagerAverageCpuUsageMilliCores": {
"dataType": "float"
},
"longhornInstanceManagerAverageMemoryUsageBytes": {
"dataType": "float"
},
"longhornManagerAverageCpuUsageMilliCores": {
"dataType": "float"
},
"longhornManagerAverageMemoryUsageBytes": {
"dataType": "float"
},
"longhornNamespaceUid": {
"dataType": "string",
"maxLen": 200
},
"longhornNodeCount": {
"dataType": "float"
},
"longhornNodeDiskHDDCount": {
"dataType": "float"
},
"longhornNodeDiskNVMeCount": {
"dataType": "float"
},
"longhornNodeDiskSSDCount": {
"dataType": "float"
},
"longhornSettingBackingImageCleanupWaitInterval": {
"dataType": "float"
},
"longhornSettingBackingImageRecoveryWaitInterval": {
"dataType": "float"
},
"longhornSettingBackupConcurrentLimit": {
"dataType": "float"
},
"longhornSettingBackupstorePollInterval": {
"dataType": "float"
},
"longhornSettingConcurrentAutomaticEngineUpgradePerNodeLimit": {
"dataType": "float"
},
"longhornSettingConcurrentReplicaRebuildPerNodeLimit": {
"dataType": "float"
},
"longhornSettingConcurrentVolumeBackupRestorePerNodeLimit": {
"dataType": "float"
},
"longhornSettingDefaultReplicaCount": {
"dataType": "float"
},
"longhornSettingEngineReplicaTimeout": {
"dataType": "float"
},
"longhornSettingFailedBackupTtl": {
"dataType": "float"
},
"longhornSettingGuaranteedInstanceManagerCpu": {
"dataType": "float"
},
"longhornSettingRecurringFailedJobsHistoryLimit": {
"dataType": "float"
},
"longhornSettingRecurringSuccessfulJobsHistoryLimit": {
"dataType": "float"
},
"longhornSettingReplicaFileSyncHttpClientTimeout": {
"dataType": "float"
},
"longhornSettingReplicaReplenishmentWaitInterval": {
"dataType": "float"
},
"longhornSettingRestoreConcurrentLimit": {
"dataType": "float"
},
"longhornSettingStorageMinimalAvailablePercentage": {
"dataType": "float"
},
"longhornSettingStorageOverProvisioningPercentage": {
"dataType": "float"
},
"longhornSettingStorageReservedPercentageForDefaultDisk": {
"dataType": "float"
},
"longhornSettingSupportBundleFailedHistoryLimit": {
"dataType": "float"
},
"longhornVolumeAccessModeRwoCount": {
"dataType": "float"
},
"longhornVolumeAccessModeRwxCount": {
"dataType": "float"
},
"longhornVolumeAccessModeUnknownCount": {
"dataType": "float"
},
"longhornVolumeAverageActualSizeBytes": {
"dataType": "float"
},
"longhornVolumeAverageNumberOfReplicas": {
"dataType": "float"
},
"longhornVolumeAverageSizeBytes": {
"dataType": "float"
},
"longhornVolumeAverageSnapshotCount": {
"dataType": "float"
},
"longhornVolumeDataLocalityBestEffortCount": {
"dataType": "float"
},
"longhornVolumeDataLocalityDisabledCount": {
"dataType": "float"
},
"longhornVolumeDataLocalityStrictLocalCount": {
"dataType": "float"
},
"longhornVolumeFrontendBlockdevCount": {
"dataType": "float"
},
"longhornVolumeFrontendIscsiCount": {
"dataType": "float"
},
"longhornVolumeOfflineReplicaRebuildingDisabledCount": {
"dataType": "float"
},
"longhornVolumeOfflineReplicaRebuildingEnabledCount": {
"dataType": "float"
},
"longhornVolumeReplicaAutoBalanceDisabledCount": {
"dataType": "float"
},
"longhornVolumeReplicaSoftAntiAffinityFalseCount": {
"dataType": "float"
},
"longhornVolumeReplicaZoneSoftAntiAffinityTrueCount": {
"dataType": "float"
},
"longhornVolumeReplicaDiskSoftAntiAffinityTrueCount": {
"dataType": "float"
},
"longhornVolumeRestoreVolumeRecurringJobFalseCount": {
"dataType": "float"
},
"longhornVolumeSnapshotDataIntegrityDisabledCount": {
"dataType": "float"
},
"longhornVolumeSnapshotDataIntegrityFastCheckCount": {
"dataType": "float"
},
"longhornVolumeUnmapMarkSnapChainRemovedFalseCount": {
"dataType": "float"
}
}
}
image:
repository: ${UPGRADE_RESPONDER_IMAGE_REPO}
tag: ${UPGRADE_RESPONDER_IMAGE_TAG}
EOF
git clone -b ${UPGRADE_RESPONDER_REPO_BRANCH} ${UPGRADE_RESPONDER_REPO}
helm upgrade --install ${APP_NAME}-upgrade-responder upgrade-responder/chart -f ${UPGRADE_RESPONDER_VALUE_YAML}
wait_for_deployment "${APP_NAME}-upgrade-responder"
}
output() {
local upgrade_responder_service_info=$(kubectl get svc/${APP_NAME}-upgrade-responder --no-headers)
local upgrade_responder_service_port=$(echo "${upgrade_responder_service_info}" | awk '{print $5}' | cut -d'/' -f1)
echo # a blank line to separate the installation outputs for better readability.
printf "[Upgrade Checker]\n"
printf "%-10s: http://${APP_NAME}-upgrade-responder.default.svc.cluster.local:${upgrade_responder_service_port}/v1/checkupgrade\n\n" "URL"
printf "[InfluxDB]\n"
printf "%-10s: ${INFLUXDB_URL}\n" "URL"
printf "%-10s: ${APP_NAME}_upgrade_responder\n" "Database"
printf "%-10s: root\n" "Username"
printf "%-10s: root\n\n" "Password"
local public_ip=$(curl -s https://ifconfig.me/ip)
local grafana_service_info=$(kubectl get svc/grafana --no-headers)
local grafana_service_port=$(echo "${grafana_service_info}" | awk '{print $5}' | cut -d':' -f2 | cut -d'/' -f1)
printf "[Grafana]\n"
printf "%-10s: http://${public_ip}:${grafana_service_port}\n" "Dashboard"
printf "%-10s: admin\n" "Username"
printf "%-10s: admin\n" "Password"
}
install_influxdb
install_upgrade_responder
install_grafana
output

View File

@ -0,0 +1,86 @@
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: grafana-pvc
spec:
accessModes:
- ReadWriteOnce
storageClassName: longhorn
resources:
requests:
storage: 2Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: grafana
name: grafana
spec:
selector:
matchLabels:
app: grafana
template:
metadata:
labels:
app: grafana
spec:
securityContext:
fsGroup: 472
supplementalGroups:
- 0
containers:
- name: grafana
image: grafana/grafana:7.1.0
imagePullPolicy: IfNotPresent
env:
- name: GF_INSTALL_PLUGINS
value: "grafana-worldmap-panel"
ports:
- containerPort: 3000
name: http-grafana
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /robots.txt
port: 3000
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 30
successThreshold: 1
timeoutSeconds: 2
livenessProbe:
failureThreshold: 3
initialDelaySeconds: 30
periodSeconds: 10
successThreshold: 1
tcpSocket:
port: 3000
timeoutSeconds: 1
resources:
requests:
cpu: 250m
memory: 750Mi
volumeMounts:
- mountPath: /var/lib/grafana
name: grafana-pv
volumes:
- name: grafana-pv
persistentVolumeClaim:
claimName: grafana-pvc
---
apiVersion: v1
kind: Service
metadata:
name: grafana
spec:
ports:
- port: 3000
protocol: TCP
targetPort: http-grafana
selector:
app: grafana
sessionAffinity: None
type: LoadBalancer

View File

@ -0,0 +1,90 @@
apiVersion: v1
kind: Secret
metadata:
name: influxdb-creds
namespace: default
type: Opaque
data:
INFLUXDB_HOST: aW5mbHV4ZGI= # influxdb
INFLUXDB_PASSWORD: cm9vdA== # root
INFLUXDB_USERNAME: cm9vdA== # root
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: influxdb
namespace: default
labels:
app: influxdb
spec:
accessModes:
- ReadWriteOnce
storageClassName: longhorn
resources:
requests:
storage: 2Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: influxdb
name: influxdb
namespace: default
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app: influxdb
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app: influxdb
spec:
containers:
- image: docker.io/influxdb:1.8.10
imagePullPolicy: IfNotPresent
name: influxdb
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
envFrom:
- secretRef:
name: influxdb-creds
volumeMounts:
- mountPath: /var/lib/influxdb
name: var-lib-influxdb
volumes:
- name: var-lib-influxdb
persistentVolumeClaim:
claimName: influxdb
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
---
apiVersion: v1
kind: Service
metadata:
labels:
app: influxdb
name: influxdb
namespace: default
spec:
ports:
- port: 8086
protocol: TCP
targetPort: 8086
selector:
app: influxdb
sessionAffinity: None
type: ClusterIP

View File

@ -976,11 +976,11 @@ Scenario: test recurring job concurrency
create volume `test-job-4`.
create volume `test-job-5`.
Then monitor the cron job pod log.
And should see 2 jobs created concurrently.
When update `snapshot1` recurring job with `concurrency` set to `3`.
Then monitor the cron job pod log.
And should see 3 jobs created concurrently.
### Upgrade strategy

View File

@ -329,7 +329,7 @@ After the enhancement, users can directly specify the BackingImage during volume
- Longhorn needs to verify the BackingImage if it's specified.
- For restore/DR volumes, the BackingImage name stored in the backup volume will be used automatically if users do not specify the BackingImage name. Verify the checksum before using the BackingImage.
- Snapshot backup:
- BackingImage name and checksum will be recorded into BackupVolume now.
- BackingImage creation:
- Need to create both BackingImage CR and the BackingImageDataSource CR. Besides, a random ready disk will be picked up so that Longhorn can prepare the 1st file for the BackingImage immediately.
- BackingImage get/list:
@ -353,7 +353,7 @@ After the enhancement, users can directly specify the BackingImage during volume
- The server will download the file immediately once the type is `download` and the server is up.
- A cancelled context will be put into the HTTP download request. When the server is stopped/failed while downloading is still in progress, the context can help stop the download.
- The service will wait for 30s at most for the download to start. If the time is exceeded, the download is considered failed.
- The download file is in `<Disk path in container>/tmp/<BackingImage name>-<BackingImage UUID>`
- Each time the image downloads a chunk of data, the progress will be updated. The first progress update means the download has started and the state will move from `starting` to `in-progress`.
- The server is ready for handling the uploaded data once the type is `upload` and the server is up.
- The query `size` is required for the API `upload`.

View File

@ -250,7 +250,7 @@ Integration test plan.
* Scale down the workload to detach the `test-vol`
* Create the same PVC `test-restore-pvc` as in the `Source volume is attached && Longhorn snapshot exist` section
* Verify that PVC provisioning failed because the source volume is detached so Longhorn cannot verify the existence of the Longhorn snapshot in the source volume.
* Scale up the workload to attach `test-vol`
* Wait for PVC to finish provisioning and be bound
* Attach the PVC `test-restore-pvc` and verify the data
* Delete the PVC

View File

@ -0,0 +1,162 @@
# Longhorn Snapshot CRD
## Summary
Supporting a Longhorn snapshot CRD allows users to query/create/delete volume snapshots using kubectl. This is one step closer to making kubectl the Longhorn CLI. It will also be a building block for the future auto-attachment/auto-detachment refactoring for snapshot creation, deletion, and volume cloning.
### Related Issues
https://github.com/longhorn/longhorn/issues/3144
## Motivation
### Goals
1. Support a Longhorn snapshot CRD to allow users to query/create/delete volume snapshots using kubectl.
2. Provide a building block for the future auto-attachment/auto-detachment refactoring for snapshot creation, deletion, and volume cloning.
3. Pay attention to scalability: a cluster with 1k volumes might have 30k snapshots, so we must avoid overloading the controller work-queue and making too many gRPC calls to engine processes.
## Proposal
Introduce a new snapshot CRD and a snapshot controller. The life cycle of a snapshot CR is as below:
1. Create (by engine monitor/kubectl)
1. When a user creates a new snapshot CR, Longhorn tries to create a new snapshot
2. When there is a snapshot in the volume that doesn't correspond to any snapshot CR, Longhorn will generate a snapshot CR for that snapshot
2. Update (by snapshot controller)
1. The snapshot controller will reconcile the snapshot CR status with the snapshot info inside the volume engine
3. Delete (by engine monitor/kubectl)
1. When a snapshot CR is deleted (by the user or by Longhorn), the snapshot controller will make sure the snapshot is removed from the engine before removing the finalizer and allowing the deletion
2. Deleting a volume should be blocked until all of its snapshots are removed
3. When there is a system-generated snapshot CR that doesn't correspond to any snapshot info inside the engine status, Longhorn will delete the snapshot CR
### User Stories
Before this enhancement, users have to use the Longhorn UI to query/create/delete volume snapshots. For users with access only to the CLI, another option is to use our [Python client](https://longhorn.io/docs/1.2.4/references/longhorn-client-python/). However, the Python client is not as intuitive and easy to use as kubectl.
After this enhancement, users will be able to use kubectl to query/create/delete Longhorn snapshots just like they can with Longhorn backups. There is no additional requirement for users to use this feature.
The experience details should be in the `User Experience In Detail` later.
#### Story 1
A user wants to limit the snapshot count to save space. Snapshot RecurringJobs set to Retain X number of snapshots do not touch unrelated snapshots, so if one ever changes the name of the RecurringJob, the old snapshots will stick around forever. These then have to be manually deleted in the UI. Some kind of browser automation framework might also work for pruning large numbers of snapshots, but this feels janky. Having a CRD for snapshots would greatly simplify this, as one could prune snapshots using kubectl, much like how one can currently manage backups using kubectl due to the existence of the `backups.longhorn.io` CRD. A rough sketch of such a flow is shown below.
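The resource name `snapshots.longhorn.io` and the `longhornvolume` label in this sketch are assumptions based on this proposal (mirroring `backups.longhorn.io` and the volume label described in the implementation section), not a finalized interface:

```bash
# List snapshot CRs belonging to one volume, then delete the unwanted ones.
kubectl -n longhorn-system get snapshots.longhorn.io -l longhornvolume=test-vol
kubectl -n longhorn-system delete snapshots.longhorn.io <snapshot-name>
```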
### User Experience In Detail
There is no additional requirement for users to use this feature.
### API changes
We don't want to make disruptive changes in this initial version of the snapshot CR (e.g., the snapshot create/delete APIs shouldn't change, and the snapshot status stays inside the engine status).
We can wait for the snapshot CRD to become a bit more mature (no scalability issues) and make the disruptive changes in the next version of the snapshot CR (e.g., the snapshot create/delete APIs change to creating/deleting snapshot CRs, and the snapshot status is removed from the engine status).
## Design
### Implementation Overview
Introduce a new snapshot CRD and a snapshot controller.
The snapshot CRD is:
```go
// SnapshotSpec defines the desired state of Longhorn Snapshot
type SnapshotSpec struct {
// the volume that this snapshot belongs to.
// This field is immutable after creation.
// Required
Volume string `json:"volume"`
// require creating a new snapshot
// +optional
CreateSnapshot bool `json:"createSnapshot"`
// The labels of snapshot
// +optional
// +nullable
Labels map[string]string `json:"labels"`
}
// SnapshotStatus defines the observed state of Longhorn Snapshot
type SnapshotStatus struct {
// +optional
Parent string `json:"parent"`
// +optional
// +nullable
Children map[string]bool `json:"children"`
// +optional
MarkRemoved bool `json:"markRemoved"`
// +optional
UserCreated bool `json:"userCreated"`
// +optional
CreationTime string `json:"creationTime"`
// +optional
Size int64 `json:"size"`
// +optional
// +nullable
Labels map[string]string `json:"labels"`
// +optional
OwnerID string `json:"ownerID"`
// +optional
Error string `json:"error,omitempty"`
// +optional
RestoreSize int64 `json:"restoreSize"`
// +optional
ReadyToUse bool `json:"readyToUse"`
}
```
The life cycle of a snapshot CR is as below:
1. **Create**
1. When a snapshot CR is created, the Longhorn mutation webhook will:
1. Add a volume label `longhornvolume: <VOLUME-NAME>` to the snapshot CR. This allows us to efficiently find the snapshots corresponding to a volume without having to list potentially thousands of snapshots.
1. Add `longhornFinalizerKey` to the snapshot CR to prevent it from being removed before Longhorn has a chance to clean up the corresponding snapshot
1. Populate the value of `snapshot.OwnerReferences` to uniquely identify the volume of this snapshot. This field contains the volume UID to uniquely identify the volume in case the old volume was deleted and a new volume was created with the same name.
2. For a user-created snapshot CR, the field `Spec.CreateSnapshot` should be set to `true`, indicating that Longhorn should provision a new snapshot for this CR.
1. The Longhorn snapshot controller will pick up this CR and check whether there is already a snapshot inside `engine.Status.Snapshots`.
1. If there is already a snapshot inside `engine.Status.Snapshots`, update `snapshot.Status` with the snapshot info inside `engine.Status.Snapshots`
2. If there isn't a snapshot inside `engine.Status.Snapshots`, then:
1. Make a call to the engine process to check whether a snapshot with the same name already exists. This makes sure we don't accidentally create two snapshots with the same name. This logic can be removed after [the issue](https://github.com/longhorn/longhorn/issues/3844) is resolved
1. If the snapshot doesn't exist inside the engine process, make another call to create the snapshot
3. For snapshots that already exist inside `engine.Status.Snapshots` but don't have corresponding snapshot CRs (i.e., system-generated snapshots), the engine monitoring will generate snapshot CRs for them. The snapshot CRs generated by engine monitoring have `Spec.CreateSnapshot` set to `false`, so the Longhorn snapshot controller will not create snapshots for those CRs; it only syncs status for them
2. **Update**
1. The snapshot CR spec and labels are immutable after creation. This is enforced by the admission webhook
2. Sync the snapshot info from `engine.Status.Snapshots` to the `snapshot.Status`.
3. If there is any error or if the snapshot is marked as removed, set `snapshot.Status.ReadyToUse` to `false`
4. If there is no snapshot info inside `engine.Status.Snapshots`, set `snapshot.Status.ReadyToUse` to `false` and populate `snapshot.Status.Error` with a message indicating that the snapshot is lost. This snapshot CR will eventually be updated again when engine monitoring updates `engine.Status.Snapshots`, or it may be cleaned up as described in the section below
4. **Delete**
1. The engine monitor is responsible for removing all snapshot CRs that don't have matching snapshot info and are in one of the following cases:
1. Snapshot CRs with `Spec.CreateSnapshot: false` (snapshot CRs that were auto-generated by the engine monitoring)
2. Snapshot CRs with `Spec.CreateSnapshot: true` and `snapCR.Status.CreationTime` set (snapshot CRs that requested a new snapshot; the snapshot was provisioned before but no longer exists)
2. When a snapshot CR has its deletion timestamp set, the snapshot controller will:
1. Check whether the actual snapshot exists inside the engine process.
1. If it exists:
1. If it has not been marked as removed, issue a gRPC call to the engine process to remove the snapshot
2. Check whether the engine is in the purging state; if not, issue a snapshot purge call to the engine process
2. If it doesn't exist, remove the `longhornFinalizerKey` to allow the deletion of the snapshot CR
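A minimal sketch of this deletion flow, assuming a simplified engine client interface; the helper names (`SnapshotDelete`, `SnapshotPurge`, `PurgeInProgress`) are illustrative stand-ins for the real engine/proxy calls:
```golang
// EngineClient is a simplified stand-in for the real engine/proxy client.
type EngineClient interface {
	SnapshotDelete(snapshotName string) error
	SnapshotPurge() error
	PurgeInProgress() (bool, error)
}

// SnapshotInfo is a simplified stand-in for an entry of engine.Status.Snapshots.
type SnapshotInfo struct {
	Name    string
	Removed bool
}

// cleanupSnapshot mirrors the Delete flow above: mark the snapshot as removed in
// the engine, make sure a purge is running, and only report done=true (so the
// finalizer can be dropped) once the snapshot has disappeared from the engine status.
func cleanupSnapshot(client EngineClient, engineSnapshots map[string]*SnapshotInfo, snapshotName string) (done bool, err error) {
	info, exists := engineSnapshots[snapshotName]
	if !exists {
		// Nothing left in the engine: the longhornFinalizerKey can be removed.
		return true, nil
	}
	if !info.Removed {
		if err := client.SnapshotDelete(snapshotName); err != nil {
			return false, err
		}
	}
	purging, err := client.PurgeInProgress()
	if err != nil {
		return false, err
	}
	if !purging {
		if err := client.SnapshotPurge(); err != nil {
			return false, err
		}
	}
	// Keep the finalizer until the purge completes and the snapshot is gone.
	return false, nil
}
```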
### Test plan
Integration test plan.
For engine enhancements, an engine integration test plan is also required.
### Upgrade strategy
Anything required if users want to upgrade to this enhancement.
## Note [optional]
How do we address the scalability issue?
1. Controller workqueue
1. Disable the resync period for the snapshot informer
1. Enqueue snapshots only when:
1. There is a change in the snapshot CR
1. There is a change in `engine.Status.CurrentState` (volume attach/detach event), `engine.Status.PurgeStatus` (for snapshot deletion events), or `engine.Status.Snapshots` (for snapshot creation/update events)
1. This enhancement proposal doesn't make additional calls to the engine process compared to the existing design.
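For example, a hedged sketch of the enqueue filter above, assuming the existing `longhorn.Engine` CR type; the actual informer event-handler wiring is omitted:
```golang
import "reflect"

// engineChangeNeedsSnapshotSync decides whether an Engine update event should cause
// the engine's snapshot CRs to be enqueued. Only the status fields that affect
// snapshot CRs are compared, so unrelated engine status churn is filtered out.
func engineChangeNeedsSnapshotSync(oldEngine, newEngine *longhorn.Engine) bool {
	if oldEngine.Status.CurrentState != newEngine.Status.CurrentState {
		return true // volume attach/detach event
	}
	if !reflect.DeepEqual(oldEngine.Status.PurgeStatus, newEngine.Status.PurgeStatus) {
		return true // snapshot deletion progress
	}
	if !reflect.DeepEqual(oldEngine.Status.Snapshots, newEngine.Status.Snapshots) {
		return true // snapshot creation/update
	}
	return false
}
```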
## Todo
For the special snapshot `volume-head`, we don't create a snapshot CR because:
1. From the use-case perspective, users cannot delete this snapshot anyway, so there is no need to generate a CR for it
1. The name `volume-head` is not globally unique; we would have to include the volume name if we wanted to generate this snapshot CR
1. We would have to implement special logic to prevent users from deleting this special CR
1. On the flip side, if we generated this special CR, users would have a complete picture of the snapshot chain
2. The volume-head CR may suddenly point to another actual file during snapshot creation.


@ -51,7 +51,7 @@ https://github.com/longhorn/longhorn/issues/3546
- Introduce a new gRPC server in Instance Manager.
- Keep reusable connections between Manager and Instance Managers.
- Allow Manager to fall back to engine binary call when communicating with old Instance Manager.
@ -101,7 +101,7 @@ So I can decide when to upgrade the Engine Image.
1. When updating the setting I see engine/replica instance manager pod and backing image manager pods is restarted.
1. I attach the volumes.
1. I describe Engine, Replica, and BackingImageManager, and see the `storageIP` in CR status is in the range of the `NetworkAttachmentDefinition` subnet/CIDR. I also see the `storageIP` is different from the `ip` in CR status.
1. I describe the Engine and see the `replicaAddressMap` in CR spec and status is using the storage IP.
1. I see pod logs indicate the network directions.
#### Story 2 - upgrade


@ -0,0 +1,196 @@
# Dedicated Recovery Backend for RWX Volume's NFS Server
## Summary
An NFS server located within the share-manager pod is a key component of an RWX volume. When the node where the share-manager pod is running goes down, the share-manager controller recreates the share-manager pod and attaches the volume to another node. However, the failover cannot work correctly because the NFS server lacks an appropriate recovery backend and the connection information is not persisted. As a result, the clients' workloads are interrupted during the failover. To give the NFS server failover capability, we want to implement a dedicated recovery backend and the associated modifications in Longhorn.
### Related Issues
[https://github.com/longhorn/longhorn/issues/2293](https://github.com/longhorn/longhorn/issues/2293)
## Motivation
### Goals
- Implement a dedicated recovery backend for Longhorn and make the NFS server highly available.
### Non-goals
- Active/active or Active/passive NFS server pattern
## Proposal
To support the NFS server's failover capability, we need to change both the client and server configurations. A dedicated recovery backend for Kubernetes and Longhorn is also necessary.
In this implementation, we will not implement the active/active or active/passive server pattern. Longhorn currently supports local filesystems such as ext4 and xfs, so changes made on the node that is providing service cannot be propagated to a standby node. This limitation rules out the active/active design. In addition, creating an engine process currently requires at least one replica before the iSCSI frontend is exported; that is, the standby engine process required for an active/passive configuration is not possible in the current Longhorn architecture.
### User Stories
When the node where the share-manager pod is running goes down, the share-manager controller will recreate the share-manager pod and attach the volume to another node. However, the failover cannot work correctly because the connection information is lost after the NFS server restarts. As a result, the locks cannot be reclaimed correctly, and the clients' filesystem operations are interrupted.
### User Experience In Detail
1. The changes and improvements should not impact the usage of the RWX volumes.
2. NFS filesystem operations will not be interrupted after the failover of the share-manager pod.
3. After the crash of the share-manager pod, the IO operations of applications on the client side will be stuck until the share-manager pod and NFS server are recreated.
4. To make the improvement work, users have to make sure the hostname of each node in the Longhorn system is unique by checking each node's hostname with the `hostname` command.
5. To shorten the failover time, users can
- Run multiple coredns pods in the Kubernetes cluster to ensure the recovery backend is always accessible.
- Reduce the NFS server's `Grace_Period` and `Lease_Lifetime`. By default, `Grace_Period` and `Lease_Lifetime` are 90 and 60 seconds, respectively. However, they can be reduced for earlier termination of the grace period at the expense of safety. Please refer to [Long client timeouts when failing over the NFS Ganesha IP resource](https://www.suse.com/support/kb/doc/?id=000019374).
- Reduce the `node-monitor-period` and `node-monitor-grace-period` values of the kube-controller-manager. An unresponsive node will then be marked as `NotReady` sooner, which speeds up the NFS server's failover process.
## Design
### Implementation Overview
- **longhorn-manager**
The failover speed of a share-manager pod and its volume is affected by the cluster's settings and resources, so it is unpredictable how long a failover takes. Thus, the NFS client mount options `soft, timeo=30, retrans=3` are replaced with `hard`.
- **share-manager**
To allow the NFSv4 clients to reclaim locks after the failover of the NFS server, the grace period is enabled by setting
- Lease_Lifetime = 60
- Grace_Period = 90
Additionally, set `NFS_Core_Param.Clustered` to `false`. With this setting, the NFS server uses the hostname of the share-manager pod (which is the same as the pod's name) rather than a generic identifier such as `node0` to create the corresponding storage in the recovery backend. The unique hostname avoids naming conflicts in the recovery backend.
- **nfs-ganesha (user-space NFS server)**
```
┌────────────────────────────────────────────────┐
│ service │
┌──► │
│ │ longhorn-nfs-recovery-backend │
│ └───────────────────────┬────────────────────────┘
│ │
HTTP API ┌─────────────┴──────────────┐
│ │ │
│ │ endpoint 1 │ endpoint N
┌──────────────────────┐ │ ┌─────────▼────────┐ ┌────────▼─────────┐
│ share-manager pod │ │ │ recovery-backend │ │ recovery-backend │
│ │ │ │ pod │ │ pod │
│ ┌──────────────────┐ │ │ │ │ ... │ │
│ │ nfs server ├─┼─┘ │ │ │ │
│ └──────────────────┘ │ │ │ │ │
│ │ │ │ │ │
└──────────────────────┘ └──────────┬───────┘ └──────────┬───────┘
│ │
│ ┌─────────────┐ │
└───────►│ configMap │◄─────┘
└─────────────┘
```
1. Introduce a recovery-backend service backed by multiple recovery-backend pods. The recovery backend is shared by multiple RWX volumes to reduce resource costs.
2. Implement a set of dedicated recovery-backend operations for Longhorn in nfs-ganesha
- recovery_init
- Create a configmap, `recovery-backend-${share-manager-pod-name}`, storing the client information
- end_grace
- Clean up the configmap
- recovery_read_clids
- Create the client reclaim list from the configmap
- add_clid
- Add the client key (the client's hostname) into the configmap
- rm_clid
- Remove the client key (the client's hostname) from the configmap
- add_revoke_fh
- Revoke the delegation
3. The data from the above operations is then persisted by sending it to the recovery-backend service, which saves it in the configmap `recovery-backend-${share-manager-pod-name}` (a sketch of this persistence step follows the example ConfigMap below).
- **Dedicated ConfigMap Format**
```
name: `recovery-backend-${share-manager-pod-name}`
labels:
longhorn.io/component: nfs-recovery-backend
...
annotations:
version: 8-bytes random id, e.g. 6SVVI1LE
data:
6SVVI1LE: {….json encoded content (containing the client identity information…}
```
One example
```
apiVersion: v1
data:
6SVVI1LE: '{"31:Linux NFSv4.1 rancher50-worker1":[],"31:Linux NFSv4.1 rancher50-worker2":[],"31:Linux
NFSv4.1 rancher50-worker3":[]}'
kind: ConfigMap
metadata:
annotations:
version: 6SVVI1LE
creationTimestamp: "2022-12-01T01:27:14Z"
labels:
longhorn.io/component: share-manager-configmap
longhorn.io/managed-by: longhorn-manager
longhorn.io/share-manager: pvc-de201ca5-ec0b-42ea-9501-253a7935fc3e
name: recovery-backend-share-manager-pvc-de201ca5-ec0b-42ea-9501-253a7935fc3e
namespace: longhorn-system
resourceVersion: "47544"
uid: 60e29c30-38b8-4986-947b-68384fcbb9ef
```
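A minimal sketch of how one of these operations (add_clid) could be persisted into the ConfigMap format above, using client-go; the function and parameter names are illustrative, not the actual share-manager/recovery-backend API:
```golang
import (
	"context"
	"encoding/json"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// addClientID records a client key (the client's hostname) reported by the NFS
// server's add_clid operation in the per-share-manager ConfigMap, keyed by the
// 8-byte version id shown above.
func addClientID(ctx context.Context, kube kubernetes.Interface, shareManagerPodName, version, clientKey string) error {
	cmName := "recovery-backend-" + shareManagerPodName
	cm, err := kube.CoreV1().ConfigMaps("longhorn-system").Get(ctx, cmName, metav1.GetOptions{})
	if err != nil {
		return err
	}
	// The data value is a JSON object mapping client identities to revoked filehandles.
	clients := map[string][]string{}
	if raw, ok := cm.Data[version]; ok && raw != "" {
		if err := json.Unmarshal([]byte(raw), &clients); err != nil {
			return err
		}
	}
	clients[clientKey] = []string{} // add_revoke_fh would append revoked filehandles here
	encoded, err := json.Marshal(clients)
	if err != nil {
		return err
	}
	if cm.Data == nil {
		cm.Data = map[string]string{}
	}
	cm.Data[version] = string(encoded)
	if cm.Annotations == nil {
		cm.Annotations = map[string]string{}
	}
	cm.Annotations["version"] = version
	_, err = kube.CoreV1().ConfigMaps("longhorn-system").Update(ctx, cm, metav1.UpdateOptions{})
	return err
}
```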
### Notice
- **In the event that the original share-manager pod is unavailable, a new share-manager pod cannot be created**
On the client side, IO to the RWX volume will hang until a replacement share-manager pod is successfully created on another node.
- **Failure to reclaim locks within the 90-second grace period**
If locks cannot be reclaimed within the grace period, they are discarded and IO errors are returned to the client, and the client establishes a new lock. The application should handle the IO error; nevertheless, not all applications can handle IO errors, depending on their implementation. Thus, this may result in failed IO operations and data loss, and data consistency may be an issue.
- **If the DNS service goes down, the share-manager pod will not be able to communicate with longhorn-nfs-recovery-backend**
The NFS-ganesha server in the share-manager pod communicates with longhorn-nfs-recovery-backend via the `longhorn-recovery-backend` service IP. Thus, a highly available DNS service is recommended to avoid communication failures.
### Test Plan
- Setup
3 worker nodes for the Longhorn cluster
- Attach 1 RWO volume to node-1
- Attach 2 RWO volumes to node-2
- Attach 3 RWO volumes to node-3
- Tests
1. Create 1 RWX volume and then run an app pod with the RWX volume on each worker node. Execute the command in each app pod
`( exec 7<>/data/testfile-${i}; flock -x 7; while date | dd conv=fsync >&7 ; do sleep 1; done )`
where ${i} is the node number.
Turn off the node where share-manager is running. Once the share-manager pod is recreated on a different node, check
- Expect
- In the client side, IO to the RWX volume will hang until a share-manager pod replacement is successfully created on another node.
- During the grace period, the server rejects READ and WRITE operations and non-reclaim locking requests (i.e., other LOCK and OPEN operations) with an error of NFS4ERR_GRACE.
- The clients can continue working without IO error.
- Lock reclaim process can be finished earlier than the 90-seconds grace period.
- If locks cannot be reclaimed after a grace period, the locks are discarded and return IO errors to the client. The client reestablishes a new lock.
2. Turn the deployment in the [example](https://github.com/longhorn/longhorn/blob/master/examples/rwx/rwx-nginx-deployment.yaml) into a daemonset and disable `Automatically Delete Workload Pod when The Volume Is Detached Unexpectedly`. Then, deploy the daemonset with an RWX volume.
Turn off the node where share-manager is running. Once the share-manager pod is recreated on a different node, check
- Expect
- The other active clients should not run into the stale handle errors after the failover.
- Lock reclaim process can be finished earlier than the 90-seconds grace period.
3. Multiple locks one single file tested by byte-range file locking
Each client ([range_locking.c](https://github.com/longhorn/longhorn/files/9208112/range_locking.txt)) in each app pod locks a different range of the same file. Afterwards, it writes data repeatedly into the file.
Turn off the node where share-manager is running. Once the share-manager pod is recreated on a different node, check
- The clients continue the tasks after the server's failover without IO or stale handle errors.
- Lock reclaim process can be finished earlier than the 90-seconds grace period.
## Note [optional]
### Reference for the NFSv4 implementation
- [Network File System (NFS) Version 4 Protocol](https://datatracker.ietf.org/doc/html/rfc7530)
- [Long client timeouts when failing over the NFS Ganesha IP resource](https://www.suse.com/support/kb/doc/?id=000019374)
- [Necessary NFS Server Cluster Design for NFS Client Lock Preservation](https://www.suse.com/support/kb/doc/?id=000020396)
- [How NFSv4 file delegations work](https://library.netapp.com/ecmdocs/ECMP1401220/html/GUID-DE6FECB5-FA4D-4957-BA68-4B8822EF8B43.html)


@ -0,0 +1,723 @@
# Longhorn System Backup/Restore
## Summary
This feature supports Longhorn system backup and restore, and also allows the user to roll back the Longhorn
system to the previous healthy state
after a failed upgrade.
Currently, we have documents to guide users on how to restore Longhorn:
- [Restore to a new cluster using Velero](https://longhorn.io/docs/1.3.0/advanced-resources/cluster-restore/restore-to-a-new-cluster-using-velero/)
- [Restore to a cluster contains data using Rancher snapshot](https://longhorn.io/docs/1.3.0/advanced-resources/cluster-restore/restore-to-a-cluster-contains-data-using-rancher-snapshot/)
However, those solutions rely on third-party tools, are not out-of-the-box, and involve tedious human intervention.
With this new feature, Longhorn's custom resources will be backed up and bundled into a single system backup file, then
saved to the remote backup target.
Later, users can choose a system backup to restore to a new cluster or to an existing cluster; this also allows
rolling the cluster back to fix a corrupted cluster state right after a failed upgrade.
### Related Issues
https://github.com/longhorn/longhorn/issues/1455
## Motivation
### Goals
- Support Longhorn system backup by backing up Longhorn custom resources, bundling them to a single file, and uploading
it to the backup target.
- Support Longhorn system restore to a new/existing cluster from a system backup.
- Support Longhorn system restoration to the previous healthy state after a failed upgrade.
- Support restoring volume from the `lastBackup` when the volume doesn't exist in the cluster during the system restore.
- Support not restoring volume from the `lastBackup` when the volume exists in the cluster during the system restore.
### Non-goals [optional]
- This feature does not deploy the Longhorn cluster. Users need to have a running Longhorn cluster to run the system
restore.
- Backing up/restoring the downstream workloads attached to Longhorn volumes. For example, if a cluster has a Pod with a
PersistentVolumeClaim, Longhorn will restore the PersistentVolumeClaim, PersistentVolume, and Volume only. After a
successful system restoration, the user can re-deploy the Pod with the same manifest.
- Restore while there are volumes still attached.
- Automatically create a system backup before an upgrade and run the system restore when the Longhorn upgrade fails.
Instead, the user needs to run a system backup and restore it manually.
- Deleting resources that do not exist in the system backup. We can probably enhance this later and make it an option for
users.
- Support restore from `BackingImage`. **Dependent on https://github.com/longhorn/longhorn/issues/4165**
## Proposal
### System Backup
```
|------------ [[ longhorn-backup-target ]] ----------|
| controller |
| |
v v
[ SystemBackup ] ---> [[ longhorn-system-backup ]] ---> [ object store ]
CR controller |
|
backupstore/system-backups/<longhorn-version>/<name>/
|
[ system-backups.zip ] <--|--> [ system-backup.cfg ]
```
1. Introduce new [v1/systembackups](#longhorn-manager-http-api) HTTP APIs.
1. Introduce a new [SystemBackup](#manager-systembackup-custom-resource) custom resource definition.
- A new custom resource triggers the creation of the system [resource](#system-backup-resources) backup.
- Deleting the custom resource triggers deletion of the system backup in the backup target.
This behavior is similar to the current backup resource handling.
- The system backups stored in the backup target will get synced to the custom resources list.
1. Introduce new responsibility to `longhorn-backup-target` controller.
- Responsible for syncing system backups in the backup target to the `SystemBackup` list.
1. Introduce a new `longhorn-system-backup` controller.
- Responsible for generating the system backup file, bundling them to a single file, and uploading it to the backup
target.
1. Generate the Longhorn resource YAML files.
1. Compress the resource YAML files into a zip file.
1. Upload to the backup target `backupstore/system-backups/<longhorn-version>/<system-backup-name>`.
>**Note:** Do not create/upload the system backup when the `SystemBackup` is created by the backup target controller.
- Responsible for deleting system backup in the backup target.
- Responsible for updating `SystemBackup` status.
- Responsible for updating the error message in the `SystemBackup` status condition.
Reference [SystemRestore](#manager-systemrestore-custom-resource) condition as the example.
1. Introduce `SystemBackup` webhook validator for [condition validation](#validator-system-backup).
### System Restore
```
[ system-backup.zip ] [ backups ]]]
^ ^
| |
[ system-backup.cfg ] <-- backupstore/system-backups/... |
| |
| backupstore/volumes/
| |
| |
[ SystemRestore ] ---> [[ longhorn-system-restore ]] ---> [ object-store ]
CR controller | |
| | |
| | |
[ Job ] | [ system-backup.cfg: engine ]
| | |
v | |
[ system-backup.cfg: manager ] ---> [[ longhorn-system-rollout ]]
controller
| |
| v
| <--- [ Volume: from backup ]
|
V
[ Resources ]]]]
```
1. Introduce new [v1/systemrestores](#longhorn-manager-http-api) HTTP APIs.
1. Introduce a new [SystemRestore](#manager-systemrestore-custom-resource) custom resource definition.
- A new custom resource triggers the creation of a system restore job.
- Deleting the custom resource triggers the deletion of the system restore job.
1. Introduce a new `longhorn-system-restore` controller.
- Responsible for creating a new system restore job that runs a `longhorn-system-rollout` controller.
- Responsible for deleting the system restore job.
1. Introduce a new `longhorn-system-rollout` controller. This controller is similar to the `uninstall controller`.
- Run inside the pod created by the system restore job.
- Responsible for downloading the system-backup from the backup target.
- Responsible for restoring [resources](#system-backup-resources) from the system-backup file.
- Responsible for updating `SystemRestore` status during system restore.
- Responsible for adding `longhorn.io/last-system-restore`, and `longhorn.io/last-system-restore-at` annotation to the
restored resources.
- Responsible for adding `longhorn.io/last-system-restore-backup`annotation to the restored volume resources.
- Responsible for updating the error message in the [SystemRestore](#manager-systemrestore-custom-resource) status
condition.
> **Note:** There are 2 areas covered for cross version restoration:
> 1. The system restore will use the manager and engine image in the system backup config to run the `longhorn-system-rollout` so the controller is compatible with the restoring resources.
> 1. When the CustomResourceDefinition is missing the version for the restoring resource, the system restore doesn't replace or remove the existing CustomResourceDefinitions. Instead, the controller adds to its versions. So system restoration doesn't break existing resources.
>
> See [Specify multiple versions](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning/#specify-multiple-versions) for details.
1. Introduce `SystemRestore` webhook validator for [condition validation](#validator-system-restore).
### User Stories
#### Longhorn system restore
Before the enhancement, the user can follow the [documents](#summary) to leverage external solutions to backup and
restore the Longhorn system.
After the enhancement, the user can backup/restore Longhorn system to/from the backup target using this Longhorn native
solution.
#### Upgrade rollback (downgrade)
Before the enhancement, the user cannot downgrade Longhorn.
After the enhancement, the user can downgrade Longhorn when there is a pre-upgrade system backup in the backup target.
> **Note:** For the Longhorn cluster version before v1.4.0, users still need to follow the Longhorn [documents](#summary)
for backup and restore the Longhorn system.
### User Experience In Detail
### Longhorn UI
1. Go to `System Backup` from the Setting drop-down menu.
```
| Dashboard | Node | Volume | Recurring Job | Backup | Setting v |
+ ======================= +
| General |
| Engine Image |
| Orphaned Data |
| Backing Image |
| Instance Manager Image |
| System Backup |
+ ======================= +
```
1. View `System Backups` and `System Restores` on the same page.
```
System Backups [Custom Column]
====================================================================================================================
[Create] [Delete] [Restore] [Search Box v ][__________][Go]
+ ======= +
======================================================================================| Name |===================
[] | Version | Name | State | Error | State |
---+----------+--------+--------+-----------------------------------------------------| Version |-------------------
[] | 1.4.0 | demo-1 | Error | error uploading system backup: failed to execute: + ======= + /engine-binaries/c3y1huang-research-000-lh-ei/longhorn [system-backup
: : : : upload --source /tmp/demo-2.zip --dest s3://backupbucket@us-east-1/ --name demo-2 --manager-image c3y1huang/research:000-lh-manager
: : : : --engine-image c3y1huang/research:000-lh-ei], output , stderr, time=\"2022-08-16T03:52:09Z\" level=fatal msg=\"Failed to run upload
: : : : system-backup command\" error=\"missing required parameter --longhorn-version\"\n, error exit status 1
[] | 1.4.0 | demo-2 | Ready |
====================================================================================================================
[<] [1] [>]
System Restores [Custom Column]
====================================================================================================================
[Delete] [Search Box v ][_________][Go]
+ ======= +
====================================================================================== | Name |==================
[] | Name | Version | State | Age | Error | State |
---+----------------+-----------+--------------+-------+------------------------------ | Version |------------------
[] | demo-1-restore | v1.4.0 | Completed | 2m26s | + ======= +
[] | demo-2-foobar | v1.4.0 | Error | 64s | Download: sample error message
[] | demo-2-restore | v1.4.0 | Initializing | 1s |
===================================================================================================================
[<] [1] [>]
```
### System Backup
***Longhorn GUI***
- The user can create system backups to the backup target.
- The system backup will be uploaded to backup target `backupstore/system-backups/<longhorn-version>/<system-backup-name>`.
- The user can view the system backup status.
***Command kubectl***
- The user can create `SystemBackup` to backup Longhorn system to the backup target.
```yaml
apiVersion: longhorn.io/v1beta2
kind: SystemBackup
metadata:
name: demo-2
namespace: longhorn-system
```
- The user can view the system backups.
```
> kubectl -n longhorn-system get lhsb
NAMESPACE NAME VERSION STATE CREATED
longhorn-system demo-1 v1.4.0 Error 2022-08-23T00:25:29Z
longhorn-system demo-2 v1.4.0 Ready 2022-08-24T02:34:57Z
```
### System Restore
***Longhorn GUI***
- The users can restore a system backup in the backup target.
- The users can view the system restore status.
- The users can restore from a different Longhorn version.
***Command kubectl***
- The user can create `SystemRestore` to restore system backup in the backup target.
```yaml
apiVersion: longhorn.io/v1beta2
kind: SystemRestore
metadata:
name: demo-2-restore
namespace: longhorn-system
spec:
systemBackup: demo-2
```
- The users can view the system restores.
```
> kubectl -n longhorn-system get lhsr
NAME STATE AGE
demo-1-restore Completed 2m26s
demo-2-foobar Error 64s
demo-2-restore Initializing 1s
```
### API changes
#### Longhorn manager HTTP API
| Method | Path | Description |
| ---------- | -------------------------------- | ---------------------------------------------------------------------- |
| **POST** | `/v1/systembackups` | Generates system backup file and upload to the backup target |
| **GET** | `/v1/systembackups` | Get all system backups, including ones that already exist in the backup target and ones that have been initialized but do not yet exist in the backup target |
| **DELETE** | `/v1/systembackups/{name}` | Delete system backup saved in the backup target |
| **POST** | `/v1/systemrestores` | Download a system backup from the backup target and restore it |
| **GET** | `/v1/systemrestores` | Get all system restores |
| **DELETE** | `/v1/systemrestores/{name}` | Delete a system restore |
| | `/v1/ws/{period}/systembackups` | Websocket stream for system backups |
| | `/v1/ws/{period}/systemrestores` | Websocket stream for system restores |
## Design
### Implementation Overview
#### Manager: SystemBackup custom resource
```yaml
apiVersion: longhorn.io/v1beta2
kind: SystemBackup
metadata:
creationTimestamp: "2022-08-25T02:50:06Z"
finalizers:
- longhorn.io
generation: 1
labels:
longhorn.io/version: v1.4.0
name: demo-2
namespace: longhorn-system
resourceVersion: "420138"
uid: 41aac4e1-4367-4e17-b4b6-cb7c19151442
spec: {}
status:
conditions: null
createdAt: "2022-08-24T04:44:32Z"
gitCommit: 95292c60bb17b77591d6dde5c8636fe6bb4de60d-dirty
lastSyncedAt: "2022-08-25T02:50:19Z"
managerImage: "longhornio/longhorn-manager:v1.4.0"
ownerID: ip-10-0-1-105
state: Ready
version: v1.4.0
```
#### Manager: sync system backups in the backup target
***longhorn-backup-target-controller***
1. Execute engine binary [system-backup list](#engine-commands).
1. Check for system backups in the backup target that are not in the `SystemBackup` list.
1. Create new `SystemBackup`s labeled with `longhorn.io/version: <version>` for the system backups that do not have a custom resource yet.
1. Check for system backups in the `SystemBackup` list that are not in the backup target.
1. Delete the `SystemBackup`s whose system backups no longer exist in the backup target.
1. Delete `SystemBackup` custom resources when the backup target is empty.
***longhorn-system-backup-controller***
1. For `SystemBackup` with the `longhorn.io/version: <version>` label, execute the engine binary
[system-backup get-config](#engine-commands) using the `<version>`.
1. Update `SystemBackup` status from the [system backup config](#system-backup-cfg).
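A hedged sketch of this sync step; `remote` is the name-to-path map returned by the `system-backup list` engine command described later, and the `datastore` interface is a hypothetical stand-in for longhorn-manager's datastore helpers:
```golang
import (
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// datastore is a hypothetical stand-in for the real datastore helpers.
type datastore interface {
	CreateSystemBackup(sb *longhorn.SystemBackup) error
	DeleteSystemBackup(name string) error
}

// versionFromPath extracts the <longhorn-version> segment from a path like
// "backupstore/system-backups/v1.4.0/demo-1".
func versionFromPath(p string) string {
	parts := strings.Split(strings.Trim(p, "/"), "/")
	if len(parts) >= 2 {
		return parts[len(parts)-2]
	}
	return ""
}

// syncSystemBackups reconciles SystemBackup CRs against the system backups found in
// the backup target: create CRs for backups that have none, and delete CRs whose
// backups no longer exist remotely.
func syncSystemBackups(remote map[string]string, existing []*longhorn.SystemBackup, ds datastore) error {
	inCluster := map[string]bool{}
	for _, sb := range existing {
		inCluster[sb.Name] = true
		if _, found := remote[sb.Name]; !found {
			if err := ds.DeleteSystemBackup(sb.Name); err != nil {
				return err
			}
		}
	}
	for name, path := range remote {
		if inCluster[name] {
			continue
		}
		// Label the new CR with the Longhorn version so the system-backup controller
		// can fetch its config and fill in the status.
		sb := &longhorn.SystemBackup{
			ObjectMeta: metav1.ObjectMeta{
				Name:   name,
				Labels: map[string]string{"longhorn.io/version": versionFromPath(path)},
			},
		}
		if err := ds.CreateSystemBackup(sb); err != nil {
			return err
		}
	}
	return nil
}
```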
#### Manager: system backup to the backup target
***POST /v1/systembackups***
```golang
type SystemBackupInput struct {
Name string `json:"name"`
}
```
1. Create `SystemBackup`.
```yaml
apiVersion: longhorn.io/v1beta2
kind: SystemBackup
metadata:
name: <name>
namespace: longhorn-system
```
1. Return system backup resource.
<a name="system-backup-resource"></a>
```golang
type SystemBackup struct {
client.Resource
Name string `json:"name"`
Version string `json:"version,omitempty"`
ManagerImage string `json:"managerImage,omitempty"`
State longhorn.SystemBackupState `json:"state,omitempty"`
CreatedAt string `json:"createdAt,omitempty"`
Error string `json:"error,omitempty"`
}
```
***webhook validator*** <a name="validator-system-backup"></a>
1. Skip validation for `SystemBackup` created by the backup target controller.
1. Allow the `SystemBackup` to be created if the following conditions are met.
- The backup target is set.
- System backup does not exist in the backup target.
***longhorn-system-backup-controller***
```none
system-backup.zip
+ metadata.yaml
|
+ yamls/
+ apiextensions/
| + customresourcedefinitions.yaml
|
+ kubernetes/
| + clusterrolebindings.yaml
| + clusterroles.yaml
| + configmaps.yaml
| + daemonsets.yaml
| + deployments.yaml
| + persistentvolumeclaims.yaml
| + persistentvolumes.yaml
| + podsecuritypolicies.yaml
| + rolebindings.yaml
| + roles.yaml
| + serviceaccounts.yaml
| + services.yaml
|
+ longhorn/
+ engineimages.yaml
+ recurringjobs.yaml
+ settings.yaml
+ volumes.yaml
```
1. Create metadata file.
```golang
type SystemBackupMeta struct {
LonghornVersion string `json:"longhornVersion"`
LonghornGitCommit string `json:"longhornGitCommit"`
KubernetesVersion string `json:"kubernetesVersion"`
LonghornNamespaceUUID string `json:"longhornNamspaceUUID"`
SystemBackupCreatedAt metav1.Time `json:"systemBackupCreatedAt"`
ManagerImage string `json:"managerImage"`
}
```
1. Generate the resource YAML files:
<a name="system-backup-resources"></a>
1. Generate the API extension resource YAML file.
- CustomResourceDefinitions with API group `longhorn.io`.
1. Generate the Kubernetes resources YAML files.
- ServiceAccounts in the Longhorn namespace.
- ClusterRoleBinding with any Longhorn ServiceAccounts in the `subjects`.
- ClusterRoles with any Longhorn ClusterRoleBindings in the `roleRef`.
- Roles in Longhorn namespace.
- PodSecurityPolicies with Longhorn Role in the `rules`.
- RoleBindings in the Longhorn namespace.
- DaemonSets in Longhorn namespace.
- Deployments in Longhorn namespace.
- `longhorn-storageclass` ConfigMap.
- Services in Longhorn namespace. The ClusterIP and ClusterIPs will get removed before converting to the YAML.
- StorageClasses with provisioner `driver.longhorn.io`.
- PersistentVolumes with Longhorn StorageClass in spec.
- PersistentVolumeClaims with Longhorn StorageClass in spec.
1. Generate the Longhorn resources YAML files.
- Longhorn Settings.
- Longhorn EngineImages.
- Longhorn Volumes.
- Longhorn RecurringJobs.
1. Archive the files to a zip file.
1. Execute engine binary [system-backup upload](#engine-commands) to upload to the backup target
`backupstore/system-backups/<longhorn-version>/<system-backup-name>`.
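As a rough illustration of the archiving step (standard library only; paths are illustrative and error handling is simplified):
```golang
import (
	"archive/zip"
	"io"
	"os"
	"path/filepath"
)

// zipSystemBackup bundles the generated YAML tree (metadata.yaml plus the yamls/
// directory) into a single archive before it is handed to the engine binary for upload.
func zipSystemBackup(srcDir, dstZip string) error {
	out, err := os.Create(dstZip)
	if err != nil {
		return err
	}
	defer out.Close()
	zw := zip.NewWriter(out)
	defer zw.Close()
	return filepath.Walk(srcDir, func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err
		}
		rel, err := filepath.Rel(srcDir, path)
		if err != nil {
			return err
		}
		w, err := zw.Create(rel) // keep the directory layout shown above
		if err != nil {
			return err
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		_, err = io.Copy(w, f)
		return err
	})
}
```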
#### Manager: list system backups (GET /v1/systembackups)
1. List `SystemBackup`.
1. Return collection of [SystemBackup](#system-backup-resource) resource.
#### Manager: delete system backup in the backup target
***DELETE /v1/systembackups/{name}***
1. Deletes `SystemBackup`.
***longhorn-system-backup-controller***
1. Execute engine binary [system-backup delete](#engine-commands) to remove the system backup
in the backup target.
1. Clean up locally generated system backup files and directories that have not been uploaded to the backup target.
#### Manager: SystemRestore custom resource
```yaml
apiVersion: longhorn.io/v1beta2
kind: SystemRestore
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"longhorn.io/v1beta2","kind":"SystemRestore","metadata":{"annotations":{},"name":"demo-2-restore","namespace":"longhorn-system"},"spec":{"systemBackup":"demo-2"}}
creationTimestamp: "2022-08-24T04:44:51Z"
finalizers:
- longhorn.io
generation: 1
name: demo-2-restore
namespace: longhorn-system
resourceVersion: "283819"
uid: ef93355d-3b73-4fdd-bed3-ec5016a6784d
spec:
systemBackup: demo-2
status:
conditions:
- lastProbeTime: ""
lastTransitionTime: "2022-08-24T04:44:59Z"
message: sample error message
reason: Download
status: "True"
type: Error
ownerID: ip-10-0-1-113
sourceURL: s3://backupbucket@us-east-1/backupstore/system-backups/v1.4.0/demo-2
state: Error
```
#### Manager: restore system backup from the backup target
***POST /v1/systemrestores***
```golang
type SystemRestoreInput struct {
Name string `json:"name"`
Version string `json:"version"`
SystemBackup string `json:"systemBackup"`
}
```
1. Create `SystemRestore`.
***webhook validator*** <a name="validator-system-restore"></a>
1. Allow the `SystemRestore` to be created if the following conditions are met.
- All volumes are detached.
- No other system restore is in progress.
- The SystemBackup used in the SystemRestore `Spec.SystemBackup` must exist.
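A hedged sketch of these checks; the webhook is assumed to have already gathered the inputs (attached volumes, any in-progress restore, and whether the referenced system backup exists), and the function and parameter names are illustrative:
```golang
import "fmt"

// validateSystemRestoreCreate rejects a new SystemRestore unless all volumes are
// detached, no other system restore is in progress, and the referenced system
// backup exists in the backup target.
func validateSystemRestoreCreate(systemBackupName string, attachedVolumes []string,
	inProgressRestore string, backupExists bool) error {
	if len(attachedVolumes) > 0 {
		return fmt.Errorf("cannot start a system restore while volumes are still attached: %v", attachedVolumes)
	}
	if inProgressRestore != "" {
		return fmt.Errorf("system restore %v is already in progress", inProgressRestore)
	}
	if !backupExists {
		return fmt.Errorf("system backup %v does not exist in the backup target", systemBackupName)
	}
	return nil
}
```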
***longhorn-system-restore-controller***
1. Get system backup config from the backup target.
1. Create a system restore job with the manager image from the [system backup config](#system-backup-cfg).
```yaml
apiVersion: batch/v1
kind: Job
metadata:
name: <system-restore-name>
namespace: longhorn-system
spec:
backoffLimit: 3
template:
metadata:
name: <system-restore-name>
spec:
containers:
- command:
- longhorn-manager
- system-restore
- <system-backup name>
env:
- name: POD_NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
- name: NODE_NAME
value: <controller-id>
image: <manager-image-in-system-backup-config>
imagePullPolicy: IfNotPresent
name: <system-restore-name>
volumeMounts:
- mountPath: /var/lib/longhorn/engine-binaries/
name: engine
nodeSelector:
kubernetes.io/hostname: <controller-id>
restartPolicy: OnFailure
serviceAccount: <longhorn-service-account>
serviceAccountName: <longhorn-service-account>
volumes:
- hostPath:
path: /var/lib/longhorn/engine-binaries/
type: ""
name: engine
```
***command: system-restore***
1. Start and run longhorn-system-rollout controller.
***longhorn-system-rollout-controller***
1. Get system backup config from the backup target.
1. Check and create the engine image in the [system backup config](#system-backup-cfg) if the engine image is not in the cluster.
1. Execute engine binary [system-backup download](#engine-commands).
1. Unzip system backup.
1. Decode the resources from files.
```golang
type SystemBackupLists struct {
customResourceDefinitionList *apiextensionsv1.CustomResourceDefinitionList
clusterRoleList *rbacv1.ClusterRoleList
clusterRoleBindingList *rbacv1.ClusterRoleBindingList
roleList *rbacv1.RoleList
roleBindingList *rbacv1.RoleBindingList
daemonSetList *appsv1.DaemonSetList
deploymentList *appsv1.DeploymentList
configMapList *corev1.ConfigMapList
persistentVolumeList *corev1.PersistentVolumeList
persistentVolumeClaimList *corev1.PersistentVolumeClaimList
serviceAccountList *corev1.ServiceAccountList
podSecurityPolicyList *policyv1beta1.PodSecurityPolicyList
engineImageList *longhorn.EngineImageList
recurringJobList *longhorn.RecurringJobList
settingList *longhorn.SettingList
volumeList *longhorn.VolumeList
}
```
- Kubernetes resources from files in the `kubernetes` directory.
- API extension resources from files in the `apiextensions` directory.
- Longhorn resources from files in the `longhorn` directory.
1. Restore Setting resources and annotate with `longhorn.io/last-system-restore`, and `longhorn.io/last-system-restore-at`.
1. Restore resources asynchronously and annotate with `longhorn.io/last-system-restore`, and `longhorn.io/last-system-restore-at`.
- ServiceAccounts.
- ClusterRoles.
- ClusterRoleBindings.
- CustomResourceDefinitions.
> **Note:** The controller will not replace the custom resource definitions for version compatibility purposes.
Instead, it will add to the existing one if the custom resource definition version is different. Or
create if missing.
- PodSecurityPolicies.
- Roles.
- RoleBindings.
- ConfigMaps.
- Deployments.
- DaemonSets.
- EngineImages.
- Volumes. Annotate with `longhorn.io/last-system-restore-backup` if the volume is restored from the backup.
- StorageClasses.
- PersistentVolumes.
- PersistentVolumeClaims.
- RecurringJobs.
1. Update [SystemRestore](#manager-systemrestore-custom-resource) status and error.
1. Shutdown longhorn-system-rollout controller.
#### Engine: commands
- [BackupStore: `system-backup upload`](#cmd-system-backup-upload)
- [BackupStore: `system-backup delete`](#cmd-system-backup-delete)
- [BackupStore: `system-backup download`](#cmd-system-backup-download)
- [BackupStore: `system-backup list`](#cmd-system-backup-list)
- [BackupStore: `system-backup get-config`](#cmd-system-backup-get-config)
#### BackupStore: commands
<a name="cmd-system-backup-upload"></a>
***Command: system-backup upload***
| Argument | Usage |
| -----------------| --------------------------------- |
| 0 | the source local file path |
| 1 | the destination system backup URL |
| Flag | Usage |
| -------------- | ---------------------------------------------------- |
| git-commit | Longhorn manager git commit of the current cluster |
| manager-image | Longhorn Manager image of the current cluster |
| engine-image | Longhorn default Engine image of the current cluster |
1. Upload local file to the object store `backupstore/system-backups/<longhorn-version>/<system-backup-name>/system-backup.zip`.
1. Create system backup config.
```golang
type SystemBackupConfig struct {
Name string
Version string
GitCommit string
BackupTargetURL string
ManagerImage string
EngineImage string
CreatedAt time.Time
Checksum string // sha512
}
```
1. Upload system backup config to the object store `backupstore/system-backups/<longhorn-version>/<system-backup-name>/system-backup.cfg`.
<a name="system-backup-cfg"></a>
```json
{
"Name":"demo-2",
"Version":"v1.4.0",
"GitCommit":"95292c60bb17b77591d6dde5c8636fe6bb4de60d-dirty",
"BackupTargetURL":"s3://backupbucket@us-east-1/",
"ManagerImage":"c3y1huang/research:000-lh-manager",
"EngineImage":"c3y1huang/research:000-lh-ei",
"CreatedAt":"2022-08-24T04:44:32.463197176Z",
"Checksum":"343b328f97f3ee7af6627eed0d9f42662633c0a2348d4eddaa8929a824452fdde0de6f5620c3ea309579bb58381e48bbb013e92492924fcd3dc57006147e2626"
}
```
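The `Checksum` field above is the SHA-512 of the uploaded zip file, so the download path can verify it; a minimal sketch of computing it (assuming the whole file is hashed as-is):
```golang
import (
	"crypto/sha512"
	"encoding/hex"
	"io"
	"os"
)

// sha512Checksum computes the value stored in the config's Checksum field so that
// the download command can verify the zip file before it is used for a restore.
func sha512Checksum(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha512.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}
```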
<a name="cmd-system-backup-download"></a>
***Command: system-backup download***
| Argument | Usage |
| -----------------| ----------------------------- |
| 0 | the source system backup URL |
| 1 | the destination local path |
1. Download the system backup zip file from object store to the local path.
1. Verify the checksum of the system backup zip file. Delete the downloaded file when the checksum is mismatched.
<a name="cmd-system-backup-delete"></a>
***Command: system-backup delete***
| Argument | Usage |
| -----------------| ----------------------- |
| 0 | the system backup URL |
1. Delete a system backup in the object store.
<a name="cmd-system-backup-list"></a>
***Command: system-backup list***
| Argument | Usage |
| -----------------| -------------------------------------------------- |
| 0 | the backup target URL where system backup exists |
1. List system backups in the object store.
```
map[string]string{
"demo-1": "backupstore/system-backups/v1.4.0/demo-1",
"demo-2": "backupstore/system-backups/v1.4.0/demo-2",
}
```
<a name="cmd-system-backup-get-config"></a>
***Command: system-backup get-config***
| Argument | Usage |
| -----------------| ----------------------- |
| 0 | the system backup URL |
1. Output the [system backup config](#system-backup-cfg) from the object store.
### Test plan
#### System Backup
- Test system backup to the backup target.
- Test system backup should fail when the backup target is empty.
- Test system backup should fail when the backup target is unreachable.
#### System Restore
***Same Version***
- Test system restore to the same cluster.
- Test system restore to a new cluster.
- Test system restore can restore volume data from the last backup.
- Test system restore should fail when the volume is attached.
- Test system restore when another one is in progress.
- Test system restore sync from the backup target.
- Test system restore of each resource when it exists in the cluster.
- Test system restore of each resource when it does not exist in the cluster.
- Test system restore failed to unzip.
- Test system restore failed to restore resources.
***Version Jump***
- Test system restore to a lower Longhorn version for each Longhorn installation method (kubectl/helm/Rancher).
- Test system restore to a higher Longhorn version for each Longhorn installation method (kubectl/helm/Rancher).
- Test system restore to cluster with multiple engine images.
### Upgrade strategy
`None`
## Note [optional]
`None`


@ -0,0 +1,189 @@
# Snapshot Checksum Calculation and Bit Rot Detection
## Summary
The Longhorn system supports volume snapshotting and stores the snapshot disk files on the local disk. However, it is impossible to check the data integrity of snapshots because the current implementation keeps no checksums for them. As a result, if the underlying storage bit rots, there is no way to detect the data corruption and repair the replicas. In this enhancement, the snapshot checksum is calculated after the snapshot is taken and is checked periodically. When a corrupted snapshot is detected, a replica rebuild is triggered to repair it.
### Related Issues
- [[IMPROVEMENT] Introduce checksum for snapshots](https://github.com/longhorn/longhorn/issues/4210)
- [[FEATURE] automatic identifying of corrupted replica (bit rot detection)](https://github.com/longhorn/longhorn/issues/3198)
## Motivation
### Goals
- Automatic snapshot hashing
- Identify corrupted snapshot
- Trigger replica rebuild when a corrupted snapshot is detected
### Non-goals
- Applying the hashing/checking mechanism to detached volumes
- Supporting concurrent snapshot hashing
- In the current architecture, instance-manager-r does not have a proxy, so snapshot requests are sent directly to the replica processes' sync-agent servers. Hence, a concurrency limit cannot be enforced inside instance-manager-r.
- From the benchmarking results, the checksum calculation consumes significant IO resources and impacts system performance considerably. We also don't know whether the Longhorn disks are on the same physical disk. If they are and the concurrency limit is larger than 1, other workloads will be impacted significantly, which could be disastrous for the entire system.
## Proposal
### User Stories
Bit rot in storage is rare but real, and it can corrupt data silently. Longhorn supports volume snapshotting and restoring a volume to a previous version. However, because the current implementation keeps no checksums for snapshots, it is impossible to ensure the data integrity of the replicas/snapshots. Although we provide a method ([ref](https://longhorn.io/docs/1.3.1/advanced-resources/data-recovery/corrupted-replica/)) to identify corrupted snapshots/replicas, the process is tedious and time-consuming for users.
### User Experience In Detail
1. Users' operations will not be affected by snapshot hashing and checking.
2. The system will consume computing and disk IO resources while hashing snapshot disk files. The CPU usage is 380m and 900m when computing the CRC64 (ISO) and SHA256 values, respectively. In the implementation, CRC64 (ISO) is used for detecting corruption.
- The snapshot hashing benchmarking results are provided below.
- The read performance will be impacted as well, as summarized in the results below.
- Environment
- Host: AWS EC2 c5d.2xlarge
- CPU: Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
- Memory: 16 GB
- Network: Up to 10Gbps
- Kubernetes: v1.24.4+rke2r1
- Result
- Disk: 200 GiB NVMe SSD as the instance store
- 100 GiB snapshot with full random data
![Snapshot Hash Performance Impact (SSD)](image/snapshot_hash_ssd_perf.png)
- Disk: 200 GiB throughput optimized HDD (st1)
- 30 GiB snapshot with full random data
![Snapshot Hash Performance Impact (HDD)](image/snapshot_hash_hdd_perf.png)
#### CLI
Add `snapshot hash` and `snapshot hash-status` commands
- `snapshot hash` issues a snapshot hashing request to the engine.
- Usage: `longhorn --url ${engine-ip}:${engine-port} snapshot hash tcp://${replica-sync-agent-ip}:${replica-sync-agent-port} --snapshot-name ${name}`
- `snapshot hash-status` requests the snapshot hashing status from engine.
- Usage: `longhorn --url ${engine-ip}:${engine-port} snapshot hash-status tcp://${replica-sync-agent-ip}:${replica-sync-agent-port}`
- `snapshot hash-cancel` cancels the snapshot hashing task.
- Usage: `longhorn --url ${engine-ip}:${engine-port} snapshot hash-cancel tcp://${replica-sync-agent-ip}:${replica-sync-agent-port}`
#### Engine Proxy gRPC API
Add `SnapshotHash`, `SnapshotHashStatus` and `SnapshotHashCancel` methods and their request and response messages.
- `SnapshotHash` issues a snapshot hashing request to engine.
- `SnapshotHashStatus` requests the snapshot hashing status from engine.
- `SnapshotHashCancel` cancels the snapshot hashing task.
#### Replica Sync-Agent gRPC API
Add `SnapshotHash`, `SnapshotHashStatus` and `SnapshotHashCancel` methods and their request and response messages.
- `SnapshotHash` issues a snapshot hashing request to replica sync-agent.
- `SnapshotHashStatus` requests the snapshot hashing status from replica sync-agent.
- `SnapshotHashCancel` cancels the snapshot hashing task.
## Design
### Implementation Overview
#### Global Settings
- **snapshot-data-integrity**
- Description: A global setting for enabling or disabling snapshot data integrity checking mode.
- Type: string
- Value:
- disabled: Disable snapshot disk file hashing and data integrity checking.
- enabled: Enables periodic snapshot disk file hashing and data integrity checking. To detect the filesystem-unaware corruption caused by bit rot or other issues in snapshot disk files, Longhorn system periodically hashes files and finds corrupted ones. Hence, the system performance will be impacted during the periodical checking.
- fast-check: Enable snapshot disk file hashing and fast data integrity checking. Longhorn system only hashes snapshot disk files if they are not hashed or if the modification time changed. In this mode, filesystem-unaware corruption cannot be detected, but the impact on system performance can be minimized.
- Default: `disabled`
- **snapshot-data-integrity-immediate-checking-after-snapshot-creation**
- Description: Hashing snapshot disk files impacts the performance of the system. The immediate snapshot hashing and checking can be disabled to minimize the impact after creating a snapshot.
- Type: bool
- Default: `false`
- **snapshot-data-integrity-cron-job**
- Description: The setting is a set of five fields in a line, indicating when Longhorn checks the data integrity of snapshot disk files.
- Type: string (Cron job format)
- Default: `0 0 */7 * *` (once a week)
#### CRDs
- Volume
- Add `volume.spec.snapshotDataIntegrity` for setting the volume's snapshot data integrity checking mode. The value can be `ignored`, `disabled`, `enabled` or `fast-check`.
- `ignored` means the volume's snapshot data integrity check follows the global setting `snapshot-data-integrity`.
- After upgrading the Longhorn system, the value is set to `ignored` for existing volumes whose `volume.spec.snapshotDataIntegrity` is not set.
- For a newly created volume, the value is `ignored` by default.
- Snapshot
- Add `snapshot.status.checksum` for recording the snapshot `crc64(iso)` checksum.
- Node
- Add `node.status.snapshotPeriodicCheckStatus.state` for indicating current periodic check state. The value can be `idle` or `in-progress`.
- Add `node.status.snapshotPeriodicCheckStatus.lastCheckedAt` for recording the start timestamp of the last checksum checking.
#### Automatic Snapshot Checksum Hashing and Checking
![Snapshot Checksum Hashing and Checking Flow](image/snapshot_checksum_calculation_flow.png)
1. The node controller creates a snapshot monitor for hashing snapshot disk files and checking their data integrity. The monitor consists of
- **1 goroutine `processSnapshotChangeEvent()`**: Send a snapshot hashing/checking task to the `snapshotCheckTaskQueue` workqueue after receiving one snapshot `UPDATE` event.
- **1 goroutine `processPeriodicSnapshotCheck()`**: Periodically create snapshot hashing/checking tasks. The period is determined by the global setting `snapshot-data-integrity-cron-job`. When the job is started, it populates engines' snapshots and sends snapshot hashing/checking tasks to the `snapshotCheckTaskQueue` channel.
- **N task workers**: Issue hashing requests to engines and detect corrupted replicas according to the results.
2. Task workers fetch tasks from `snapshotCheckTaskQueue` and check if the snapshot disk file needs to be hashed. The rules are
- If one of the following conditions is met, do not hash the file
- Volume-head disk file, i.e. `Volume Head` in the following figure
- System-generated snapshot disk file, e.g. `ccb017f6` and `9a8d5c9c`.
![Snapshot Hash](image/snapshot_hash.png)
3. Issue snapshot hashing requests to their associated engines. Then, the checksum of the snapshot disk file is calculated individually in the replica process. To ensure only one in-progress calculation, the worker holds the per-node file lock (`/host/var/lib/longhorn/.lock/hash`) when calculating the checksum to avoid significant storage performance drop caused by the concurrent calculations.
4. The worker waits until each snapshot disk file's checksum calculation has been completed. It periodically polls the engine and checks the status during the waiting period.
5. The worker gets the result once the calculation is completed. The result is like
```
map[string]string{
"pvc-abc-r-001": 0470c08fbc4dc702,
"pvc-abc-r-002": 0470c08fbc4dc702,
"pvc-abc-r-003": ce7c12a4d568fddf,
}
```
6. The **final checksum** is determined by the majority of the checksums with `SilentCorrupted=false` from the replicas. For instance, the **final checksum** of the result in step 5 is `0470c08fbc4dc702`.
- When all checksums differ, the **final checksum** is unable to be determined
- If `snapshot.status.checksum` is empty
- Set all replicas to `ERR`
- If `snapshot.status.checksum` is already set
- Use the `snapshot.status.checksum` as the **final checksum**, and set the replicas that have mismatching checksums to `ERR`
- When the **final checksum** is successfully determined
- Assign the **final checksum** to `snapshot.status.checksum`
- Set the replica to `ERR` if its snapshot disk file's checksum is not equal to `snapshot.status.checksum`
- If the **final checksum** cannot be determined, an event reporting the detected corruption is still emitted.
- For example, Longhorn will not do any error handling and just emits an event when silent corruption is found in a single-replica volume.
7. Then, the replicas in `ERR` mode will be rebuilt and fixed. An event reporting the detected corruption is also emitted.
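A hedged sketch of the majority selection in step 6 (filtering out results with `SilentCorrupted=true` is omitted for brevity):
```golang
// finalChecksum picks the checksum reported by a strict majority of replicas. ok is
// false when no majority exists, e.g. when all checksums differ; in that case the
// caller falls back to the existing snapshot.status.checksum or marks every replica
// with a mismatching checksum as failed, as described above.
func finalChecksum(replicaChecksums map[string]string) (checksum string, ok bool) {
	counts := map[string]int{}
	for _, c := range replicaChecksums {
		counts[c]++
	}
	for c, n := range counts {
		if n*2 > len(replicaChecksums) {
			return c, true
		}
	}
	return "", false
}
```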
#### Snapshot Disk File Hashing in Replica Process
When the replica process receives a snapshot disk file hashing request, the checking mode is determined by `volume.spec.snapshotDataIntegrity`. If the value is `ignored`, the checking mode follows the global setting `snapshot-data-integrity`.
- **`fast-check`** (see the sketch after this list)
- Flow
1. Get `ctime` information of the snapshot disk file.
2. Get the value of the extended attribute `user.longhorn-system.metadata`, which records the checksum and `ctime` of the file from the last calculation. The value of `user.longhorn-system.metadata` is a JSON-formatted string that records the hashing method, checksum, `ctime`, etc.
3. Compare the `ctime` values from steps 1 and 2. Recalculate the checksum if one of the following conditions is met
- The two `ctime` values do not match.
- The `ctime` from step 2 does not exist.
4. Ensure that the checksum is reliable by reading the `ctime` of the disk file again after the checksum calculation.
- If it matches the `ctime` from step 1, update the extended attribute with the latest result.
- Otherwise, the file has been changed by snapshot pruning, merging or other operations; recalculate the checksum. A maximum retry count limits the recalculations.
5. Return the checksum or an error to the engine.
- **enabled**
- Silent data corruption in snapshot disk files can be caused by the host's storage device (e.g., bit rot) or somewhere within the storage stack, and the filesystem cannot be aware of it. To detect such corruption, the checksums of the disk files are always recalculated and returned to the engine. Silent corruption is detected when the disk file's `ctime` matches the `ctime` in the extended attribute but the checksums do not match. In that case, the extended attribute is not updated, and the `SilentCorrupted` field of the hash status is set to `true`.
- **disabled**
- Do nothing.
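Below is a minimal sketch of the `fast-check` flow described above, in Go. It assumes the extended attribute name from this proposal; the JSON field names, helper names, and error handling are illustrative only and are not the actual longhorn-engine implementation.
```go
package main

import (
	"encoding/json"
	"fmt"

	"golang.org/x/sys/unix"
)

const xattrName = "user.longhorn-system.metadata"

// snapshotXattr mirrors the metadata recorded in the extended attribute
// (field names are assumptions for illustration).
type snapshotXattr struct {
	Method   string `json:"method"`
	Checksum string `json:"checksum"`
	Ctime    string `json:"ctime"`
}

func ctimeOf(path string) (string, error) {
	var st unix.Stat_t
	if err := unix.Stat(path, &st); err != nil {
		return "", err
	}
	return fmt.Sprintf("%d.%d", st.Ctim.Sec, st.Ctim.Nsec), nil
}

// fastCheckChecksum reuses the recorded checksum when the file's ctime has not
// changed; otherwise it recalculates via calc and refreshes the xattr.
func fastCheckChecksum(path string, calc func(string) (string, error)) (string, error) {
	ctimeBefore, err := ctimeOf(path)
	if err != nil {
		return "", err
	}

	buf := make([]byte, 4096)
	if n, err := unix.Getxattr(path, xattrName, buf); err == nil {
		var recorded snapshotXattr
		if json.Unmarshal(buf[:n], &recorded) == nil && recorded.Ctime == ctimeBefore {
			return recorded.Checksum, nil // fast path: nothing changed since the last calculation
		}
	}

	checksum, err := calc(path)
	if err != nil {
		return "", err
	}

	// Re-check ctime to make sure the file was not changed (e.g. by pruning or
	// merging) during the calculation; the real implementation retries here.
	ctimeAfter, err := ctimeOf(path)
	if err != nil {
		return "", err
	}
	if ctimeAfter != ctimeBefore {
		return "", fmt.Errorf("file changed during checksum calculation, retry needed")
	}

	data, _ := json.Marshal(snapshotXattr{Method: "crc64", Checksum: checksum, Ctime: ctimeBefore})
	if err := unix.Setxattr(path, xattrName, data, 0); err != nil {
		return "", err
	}
	return checksum, nil
}
```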
### Test Plan
**Integration tests**
- Test snapshot disk files hashing
- Compare the checksum recorded in `snapshot.status.checksum` with the checksum (calculated by a [3rd-party CRC64 checksum utility](#3rd-party-crc64-checksum-utility-for-test-plan)) of each replica's snapshot disk file.
- Test snapshot disk files check
- Corrupt a snapshot disk file in one of the replicas. Then, check that Longhorn detects the corruption and that the replica rebuilding is triggered.
## Note [optional]
### 3rd-party CRC64 Checksum Utility for Test Plan
- Install Java (`apt install default-jre default-jdk`)
- Download jacksum (`wget https://github.com/jonelo/jacksum/releases/download/v3.4.0/jacksum-3.4.0.jar`)
- Calculate checksum by `java -jar jacksum-3.4.0.jar -a crc64_go-iso ${path-to-file}`
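For cross-checking inside a test, a short Go program can produce the same value, assuming the `crc64_go-iso` algorithm corresponds to Go's `hash/crc64` ISO table:
```go
package main

import (
	"fmt"
	"hash/crc64"
	"io"
	"os"
)

// Usage: go run crc64sum.go <path-to-file>
func main() {
	f, err := os.Open(os.Args[1])
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// crc64.ISO is the polynomial the `go-iso` variant refers to.
	h := crc64.New(crc64.MakeTable(crc64.ISO))
	if _, err := io.Copy(h, f); err != nil {
		panic(err)
	}
	fmt.Printf("%016x\n", h.Sum64())
}
```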

View File

@ -0,0 +1,183 @@
# Record Recurring Jobs in the Backup Volume
## Summary
Provide a way that users can record recurring jobs/groups during the backup creation and restore them during the backup restoration. The feature will back up all recurring jobs/groups of the volume into the backup volume configuration on the backup target and restore all jobs when users want to create a volume from a backup with recurring jobs/groups stored in the backup volume.
### Related Issues
https://github.com/longhorn/longhorn/issues/2227
## Motivation
### Goals
1. Backup or update all recurring jobs/groups to the backup volume during the backup creation.
2. Create recurring jobs/groups and bind them to the volume restored from a backup optionally.
3. It is backward compatible with existing backups that have no recurring jobs info.
### Non-goals [optional]
1. Backing up or restoring a specific recurring job/group is not supported.
2. A DR volume will not restore recurring jobs/groups during a backup restoration.
## Proposal
1. Add a global boolean setting `restore-volume-recurring-jobs`. Default value is `false`.
When users create a volume from a backup and this setting is set to be `true`, it will automatically restore all recurring jobs/groups stored in the backup volume.
2. Add a customized string parameter `RestoreVolumeRecurringJob` in `Volume` CR. Default value is `"ignored"`. `"enabled"` is to restore recurring jobs/groups. By contrast, `"disabled"` is not to restore.
Users can override the default behavior during the restoration at runtime by this parameter.
### User Stories
#### Story 1
Users can simply create recurring jobs by restoring a backup created by another Longhorn system,
and continue to back up the restored volume to the backup target with the same recurring job settings.
#### Story 2
When users delete the volume's recurring jobs by accident, they can restore them from the backup volume by restoring a backup, instead of recreating the recurring jobs manually.
### User Experience In Detail
#### Via Longhorn GUI
- Users can set `restore-volume-recurring-jobs` to be `true` on the `Settings` page.
- When users restore a backup to create a volume, they can see the recurring jobs/groups are restored and enabled automatically on the volume details page.
- Users can check the checkbox `enabled` or `disabled` to override the global setting of restoring recurring jobs/groups.
#### Via `kubectl`
- Users can use the command `kubectl -n longhorn-system edit settings` to set `restore-volume-recurring-jobs` to `true`.
- Users can set `Volume.spec.restoreVolumeRecurringJob` to `enabled` or `disabled` to override the global setting of restoring recurring jobs/groups when creating a volume from a backup.
- When users create a volume by restoring a backup, they can see the recurring jobs/groups are restored as `RecurringJob` CRs and labeled in the `Volume` CR.
```yaml
...
kind: Volume
metadata:
  labels:
    longhornvolume: restore-demo
  name: restore-demo
  namespace: longhorn-system
spec:
  restoreVolumeRecurringJob: "enabled"
  fromBackup: "nfs://nfs-server.com:/opt/shared-path/?backup=backup-f6d9b9caa9444543&volume=backup1"
...
```
### API changes
Add a string parameter `RestoreVolumeRecurringJob` to the `Volume` struct utilized by the HTTP client;
this ends up being stored in `Volume.spec.restoreVolumeRecurringJob` of the volume CR.
## Design
### Implementation Overview
1. Add a global boolean setting `restore-volume-recurring-jobs`. Default value is `false`. It will restore all recurring jobs/groups of the backup volume during a backup restoration if this setting is set to be `true`.
2. Add the parameter `RestoreVolumeRecurringJob` into `Volume` struct of api/model.go and volume CR. Default value is `"ignored"`.
3. Store all recurring jobs information of the volume into the backup volume configuration on the backup target during the backup creation.
- Previously, Longhorn saved the `"RecurringJob":"c-jaim49"` information in the backup CR's `spec.labels` to show that the backup was created by a recurring job. This information is also stored in the backup volume configuration on the backup target and synced to the backup volume CR's `status.labels`, but it only contains the recurring job name and is overwritten whenever any recurring job creates a backup.
- Now we back up the details of the recurring jobs/groups into the backup volume configuration on the backup target and synchronize them to the backup volume CR's `status.labels`. When users need to restore recurring jobs/groups to the current Longhorn system or another one, Longhorn gets the recurring jobs/groups configuration from the backup volume CR.
```text
Backup Controller
queue ┌───────────────┐ ┌───────────────────────┐
┌┐ ┌┐ ┌┐ │ │ │ │
... ││ ││ ││ ──────► │ ... | ──────► │ reconcile() │
└┘ └┘ └┘ │ │ │ │
└───────────────┘ └──────────┬────────────┘
│ instance-manager
┌──────────▼────────────┐ ┌──────────────────────┐ ┌──────────────────────┐
│ │ │ │ │ │
│ enableBackupMonitor() │ ──────► │ NewBackupMonitor() │ ... ─► │ SnapshotBackup() │ ...
│ │ │ │ │ │
└───────────────────────┘ └──────────────────────┘ └──────────────────────┘
```
1. The `backup_controller` will be responsible for collecting the recurring jobs information and sending it to the backup monitor when it detects a newly created backup CR.
2. The `backup_monitor` will put the recurring jobs information into the backup CR's `spec.labels` under a new key `VolumeRecurringJobInfo` and trigger the backup creation.
3. The recurring jobs information in the labels will be stored into the backup volume configuration by `backupstore` (a sketch of how this label can be assembled follows this list).
Example of recurring jobs/groups information stored in the backup volume configuration:
```json
{ ...,
  "Labels": {
    "RecurringJob":"c-jaim49",
    "VolumeRecurringJobInfo": "{
      \"c-jaim49\": {
        \"jobSpec\": {\"name\":\"c-jaim49\",\"task\":\"backup\",\"cron\":\"0/1 * * * *\",\"retain\":3,\"concurrency\":1},
        \"fromGroup\":null,
        \"fromJob\":true
      },
      \"c-qakbzx\": {
        \"jobSpec\":{\"name\":\"c-qakbzx\",\"groups\":[\"default\"],\"task\":\"backup\",\"cron\":\"0 0 * * *\",\"retain\":5,\"concurrency\":3},
        \"fromGroup\":[\"default\"],
        \"fromJob\":false
      },
      \"c-ua7pxz\": {
        \"jobSpec\":{\"name\":\"c-ua7pxz\",\"groups\":[\"testgroup01\"],\"task\":\"backup\",\"cron\":\"0/10 0/2 * * *\",\"retain\":3,\"concurrency\":3},
        \"fromGroup\":[\"testgroup01\"],
        \"fromJob\":true
      }
    }",
    "longhorn.io/volume-access-mode":"rwo"
  },
  ...,
}
```
4. Create all recurring jobs if they do not exist when restoring a backup with the setting `restore-volume-recurring-jobs` being `true` or `Volume.spec.restoreVolumeRecurringJob` being `"enabled"`.
```text
Volume Controller
queue ┌───────────────┐ ┌───────────────────────┐
┌┐ ┌┐ ┌┐ │ │ │ │
... ││ ││ ││ ──────► │ ... | ──────► │ syncVolume() │
└┘ └┘ └┘ │ │ │ │
└───────────────┘ └──────────┬────────────┘
┌───────────▼─────────────┐
│ │
│ updateRecurringJobs() │
│ │
└─────────────────────────┘
```
1. Before a restoration starts, create all recurring jobs obtained from the backup volume CR if they do not exist or their configuration differs, and set the volume's recurring job labels to `"enabled"`.
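A rough sketch of how the backup monitor could assemble the `VolumeRecurringJobInfo` label shown earlier. The type and function names are assumptions for illustration; the real types live in longhorn-manager.
```go
package main

import "encoding/json"

// RecurringJobSpec mirrors the fields visible in the example label above.
type RecurringJobSpec struct {
	Name        string   `json:"name"`
	Groups      []string `json:"groups,omitempty"`
	Task        string   `json:"task"`
	Cron        string   `json:"cron"`
	Retain      int      `json:"retain"`
	Concurrency int      `json:"concurrency"`
}

type VolumeRecurringJobInfo struct {
	JobSpec   RecurringJobSpec `json:"jobSpec"`
	FromGroup []string         `json:"fromGroup"`
	FromJob   bool             `json:"fromJob"`
}

// buildVolumeRecurringJobLabel serializes the collected job info into the
// single JSON string stored under the "VolumeRecurringJobInfo" backup label.
func buildVolumeRecurringJobLabel(jobs map[string]VolumeRecurringJobInfo) (map[string]string, error) {
	encoded, err := json.Marshal(jobs)
	if err != nil {
		return nil, err
	}
	return map[string]string{"VolumeRecurringJobInfo": string(encoded)}, nil
}
```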
### Test plan
#### Prepare
1. Create a volume and attach it to a node or a workload.
2. Create some recurring jobs (some are in groups)
3. Label the volume with created recurring jobs (some are in groups)
4. Create a backup or wait for a recurring job to start.
5. Wait for the backup creation to complete.
6. Check if recurring jobs/groups information is stored in the backup volume configuration on the backup target
#### Recurring Jobs exist
1. Create a volume from the backup just created.
2. Check the volume if it has labels of recurring jobs and groups.
#### Recurring Jobs do not exist
1. Delete the recurring jobs that are already stored in the backup volume on the backup target.
2. Create a volume from the backup just created.
3. Check if recurring jobs have been created.
4. Check if the restored volume has labels of recurring jobs and groups.
### Upgrade strategy
This enhancement doesn't require an upgrade strategy.

View File

@ -2,7 +2,7 @@
## Summary
This enhancement adds support for user configured (storage class, secrets) encrypted volumes,
this in return means that backups of that volume end up also being encrypted.
### Related Issues
@ -16,7 +16,7 @@ this in return means that backups of that volume end up also being encrypted.
## Motivation
### Goals
- user is able to create & use an encrypted volume with cipher customization options
- user is able to configure the keys that are used for encryption
- user is able to take backups from an encrypted volume
- user is able to restore an encrypted backup to a new encrypted volume
@ -29,20 +29,20 @@ this in return means that backups of that volume end up also being encrypted.
## Proposal
### User Stories
All regular longhorn operations should also be supported for encrypted volumes,
therefore the only user story that is mentioned is
how to create and use an encrypted volume.
#### Create and use an encrypted volume
- create a storage class with (encrypted=true) and either a global secret or a per volume secret
- create the secret for that volume in the configured namespace with customization options of the cipher for instance `cipher`, `key-size` and `hash`
- create a pvc that references the created storage class
- volume will be created then encrypted during first use
- afterwards a regular filesystem that lives on top of the encrypted volume will be exposed to the pod
### User Experience In Detail
Creation and usage of an encrypted volume requires 2 things:
- the storage class needs to specify `encrypted: "true"` as part of its parameters.
- secrets need to be created and reference for the csi operations need to be setup.
- see below examples for different types of secret usage.
@ -85,6 +85,9 @@ metadata:
stringData:
CRYPTO_KEY_VALUE: "Simple passphrase"
CRYPTO_KEY_PROVIDER: "secret" # this is optional we currently only support direct keys via secrets
CRYPTO_KEY_CIPHER: "aes-xts-plain64" # this is optional
CRYPTO_KEY_HASH: "sha256" # this is optional
CRYPTO_KEY_SIZE: "256" # this is optional
```
#### Create storage class that utilizes per volume secrets
@ -112,18 +115,26 @@ parameters:
### API changes
add a `Encrypted` boolean to the `Volume` struct utilized by the http client,
this ends up being stored in `Volume.Spec.encrypted` of the volume cr.
Storing the `Encrypted` value is necessary to support encryption for RWX volumes.
## Design
### Implementation Overview
Host requires `dm_crypt` kernel module as well as `cryptsetup` installed.
We utilize the below parameters from a secret,
- `CRYPTO_KEY_PROVIDER` allows us in the future to add other key management systems
- `CRYPTO_KEY_CIPHER` allow users to choose the cipher algorithm when creating an encrypted volume by `cryptsetup`
- `CRYPTO_KEY_HASH` specifies the hash used in the LUKS key setup scheme and volume key digest
- `CRYPTO_KEY_SIZE` sets the key size in bits. The argument has to be a multiple of 8 and the maximum interactive passphrase length is 512 (characters)
```yaml
CRYPTO_KEY_VALUE: "Simple passphrase"
CRYPTO_KEY_PROVIDER: "secret" # this is optional we currently only support direct keys via secrets
CRYPTO_KEY_CIPHER: "aes-xts-plain64" # this is optional
CRYPTO_KEY_HASH: "sha256" # this is optional
CRYPTO_KEY_SIZE: "256" # this is optional
```
- utilize host `dm_crypt` kernel module for device encryption
@ -135,7 +146,7 @@ We utilize the below parameters from a secret, `CRYPTO_KEY_PROVIDER` allows us i
- during csi `NodeStageVolume` encrypt (first time use) / open regular longhorn device
- this exposes a crypto mapped device (/dev/mapper/<volume-name>)
- mount crypto device into `staging_path`
- during csi `NodeUnstageVolume` unmount `staging_path` close crypto device
### Test plan
@ -146,6 +157,14 @@ We utilize the below parameters from a secret, `CRYPTO_KEY_PROVIDER` allows us i
- create a pod that uses that pvc for a volume mount
- wait for pod up and healthy
#### Successful Creation of an encrypted volume with customization of the cipher
- create a storage class with (encrypted=true) and either a global secret or a per volume secret
- create the secret with customized options of the cipher for that volume in the configured namespace
- create a pvc that references the created storage class
- create a pod that uses that pvc for a volume mount
- wait for pod up and healthy
- check if the customized options of the cipher are correct
#### Missing Secret for encrypted volume creation
- create a storage class with (encrypted=true) and either a global secret or a per volume secret
- create a pvc that references the created storage class

View File

@ -0,0 +1,87 @@
# Filesystem Trim
## Summary
Longhorn can reclaim disk space by allowing the filesystem to trim/unmap the unused blocks occupied by removed files.
### Related Issues
https://github.com/longhorn/longhorn/issues/836
## Motivation
### Goals
1. Longhorn volumes support the operation `unmap`, which is actually the filesystem trim.
2. Since some unused blocks are in the snapshots, these blocks can be freed if the snapshots are no longer required.
## Proposal
1. Longhorn tgt should support the `UNMAP` command. When the filesystem of a Longhorn volume receives the `fstrim` command, the iSCSI initiator actually sends `UNMAP` requests to the target.
To understand the iSCSI protocol message of `UNMAP` before starting the implementation, we can refer to Section "3.54 UNMAP command" of [the doc](https://www.seagate.com/files/staticfiles/support/docs/manual/Interface%20manuals/100293068j.pdf).
2. By design, snapshots of a Longhorn volume are immutable and lots of the blocks of removed files may be in the snapshots.
This implies that, if we do nothing else, we have to skip these blocks and free blocks in the current volume head only, which would greatly degrade the effectiveness of this feature.
To release as much space as possible, we can do the unmap for all contiguous unavailable (removed or system) snapshots behind the volume head, which is similar to the snapshot prune operation.
3. Longhorn volumes do not mark snapshots as removed on their own, hence most of the time there are no contiguous unavailable snapshots during the trim.
To make it more practicable, we introduce a new global setting for all volumes. It automatically marks the latest snapshot and its ancestors as removed and stops at the snapshot containing multiple children.
Besides, there is a per-volume option that can override the global setting and directly indicate if this automatic removal is enabled. By default, it is ignored and the volumes follow the global setting.
### User Stories
#### Reclaim the space wasted by the removed files in a filesystem
Before the enhancement, there is no way to reclaim the space. To shrink the volume, users have to launch a new volume with a new filesystem, and copy the existing files from the old volume filesystem to the new one, then switch to use the new volume.
After the enhancement, users can directly reclaim the space by trimming the filesystem via cmd `fstrim` or Longhorn UI. Besides, users can enable the new option so that Longhorn can automatically mark the snapshot chain as removed then trim the blocks recorded in the snapshots.
### User Experience In Detail
1. Users can enable the option for a specific volume by modifying the volume option `volume.Spec.UnmapMarkSnapChainRemoved`, or directly set the global setting `Remove Snapshots During Filesystem Trim`.
2. For an existing Longhorn volume that contains a filesystem with files removed from it, users can directly run the command `fstrim <filesystem mount point>` or click the Longhorn UI button `Trim Filesystem`.
3. Users will observe that the snapshot chain is marked as removed, and both these snapshots and the volume head will be shrunk.
### API Changes
- Volume APIs:
- Add `updateUnmapMarkSnapChainRemoved`: Control if Longhorn will remove snapshots during the filesystem trim, or just follow the global setting.
- Add `trimFilesystem`: Trim the filesystem of the volume. Best Effort.
- Engine APIs:
- Add `unmap-mark-snap-chain-removed`: `--enable` or `--disable`. Control if the engine and all its replicas will mark the snapshot chain as removed once receiving an `UNMAP` request.
## Design
### Implementation Overview
#### longhorn-manager:
1. Add a setting `Remove Snapshots During Filesystem Trim`.
2. Add fields for CRDs: `volume.Spec.UnmapMarkSnapChainRemoved`, `engine.Spec.UnmapMarkSnapChainRemoved`, `replica.Spec.UnmapMarkDiskChainRemoved`.
3. Add 2 HTTP APIs mentioned above: `updateUnmapMarkSnapChainRemoved` and `trimFilesystem`.
4. Update controllers.
1. `Volume Controller`:
1. Update the engine and replica field based on `volume.Spec.UnmapMarkSnapChainRemoved` and the global setting.
2. Enqueue the change for the field and the global setting.
2. `Engine Controller`:
1. The monitor thread should compare the CR field `engine.Spec.UnmapMarkSnapChainRemoved` with the current option value inside the engine process,
then call the engine API `unmap-mark-snap-chain-removed` if there is a mismatch.
2. The process creation function should specify the option `unmap-mark-snap-chain-removed`.
3. `Replica Controller`:
1. The process creation function should specify the option `unmap-mark-disk-chain-removed`.
#### longhorn-engine:
1. Update dependency `rancher/tgt`, `longhorn/longhornlib`, and `longhorn/sparse-tools` for the operation `UNMAP` support.
2. Add new option `unmap-mark-snap-chain-removed` for the engine process creation call.
Add new option `unmap-mark-disk-chain-removed` for the replica process creation call.
3. Add a new API `unmap-mark-snap-chain-removed` to update the field for the engine and all its replicas.
4. The engine process should be able to recognize the `UNMAP` request from the tgt, then forward the request to all its replicas via the dataconn service. This is similar to data R/W.
5. When each replica receives a trim/unmap request, it should decide if the snapshot chain can be marked as removed, then collect all trimmable snapshots, punch holes in these snapshots and the volume head, and calculate the trimmed space (a minimal hole-punching sketch follows below).
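A minimal sketch of the hole punching mentioned in step 5, which is how a replica can release the trimmed ranges in its sparse snapshot and volume-head files. The function name is illustrative only.
```go
package main

import (
	"os"

	"golang.org/x/sys/unix"
)

// punchHole deallocates length bytes starting at offset while keeping the
// apparent file size unchanged, so the sparse file simply loses those blocks.
func punchHole(f *os.File, offset, length int64) error {
	return unix.Fallocate(int(f.Fd()),
		unix.FALLOC_FL_PUNCH_HOLE|unix.FALLOC_FL_KEEP_SIZE,
		offset, length)
}
```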
#### instance-manager:
- Update the dependencies.
- Add the corresponding proxy API for the new engine API.
#### longhorn-ui:
1. Add 2 new operations for Longhorn volume.
- API `updateUnmapMarkSnapChainRemoved`:
- The backend accepts 3 values of the input `UnmapMarkSnapChainRemoved`: `"enabled"`, `"disabled"`, `"ignored"`.
- The UI can rename this option to `Remove Current Snapshot Chain during Filesystem Trim`, and value `"ignored"` to `follow the global setting`.
- API `trimFilesystem`: No input is required.
2. The volume creation call accepts a new option `UnmapMarkSnapChainRemoved`. This is almost the same as the above update API.
### Test Plan
#### Integration tests
Test if the unused blocks in the volume head and the snapshots can be trimmed correctly without corrupting other files, and if the snapshot removal mechanism works when the option is enabled or disabled.
#### Manual tests
Test if the filesystem trim works correctly when there are continuous writes into the volume.
### Upgrade strategy
N/A

View File

@ -0,0 +1,410 @@
# Support Bundle Enhancement
## Summary
This feature replaces the support bundle mechanism with the general purpose [support bundle kit](https://github.com/rancher/support-bundle-kit).
Currently, the Longhorn support bundle file is hard to understand, and analyzing it is difficult.
With the new support bundle, the user can simulate a mocked Kubernetes cluster that can be interacted with via `kubectl`, which makes the analysis process more intuitive.
### Related Issues
https://github.com/longhorn/longhorn/issues/2759
## Motivation
### Goals
- Replace the Longhorn support-bundle generation mechanism with the support bundle manager.
- Keep the same support bundle HTTP API endpoints.
- Executing the `support-bundle-kit simulator` on the support bundle can start a mocked Kubernetes API server that can be interacted with using `kubectl`.
- Introduce a new `support-bundle-manager-image` setting for easy support-bundle manager image replacement.
- Introduce a new `support-bundle-failed-history-limit` setting to avoid an unexpected accumulation of failed support bundles.
### Non-goals [optional]
`None`
## Proposal
- Introduce the new `SupportBundle` custom resource definition.
- Creating a new custom resource triggers the creation of a support bundle manager deployment. The support-bundle manager is responsible for support bundle collection and exposes it to `https://<ip>:8080/bundle`.
- Deleting the SupportBundle custom resource deletes the support bundle manager deployment it owns.
- Introduce a new `longhorn-support-bundle` controller.
- Responsible for SupportBundle custom resource status updates and event recordings.
- [The controller reacts in phases based on the SupportBundle state.](#manager-supportbundle-creation-handling-in-longhorn-support-bundle-controller)
- Responsible for cleaning up the support bundle manager deployment when the owner `SupportBundle` custom resource is tagged for deletion.
- There is no change to the HTTP API endpoints. This feature replaces the handler function logic.
- Introduce a new `longhorn-support-bundle` service account with `cluster-admin` access. The current `longhorn-service-account` service account cannot collect the following resources.
```
Failed to get /api/v1/componentstatuses
Failed to get /apis/authentication.k8s.io/v1/tokenreviews
Failed to get /apis/authorization.k8s.io/v1/selfsubjectrulesreviews
Failed to get /apis/authorization.k8s.io/v1/subjectaccessreviews
Failed to get /apis/authorization.k8s.io/v1/selfsubjectaccessreviews
Failed to get /apis/certificates.k8s.io/v1/certificatesigningrequests
Failed to get /apis/networking.k8s.io/v1/ingressclasses
Failed to get /apis/policy/v1beta1/podsecuritypolicies
Failed to get /apis/rbac.authorization.k8s.io/v1/clusterroles
Failed to get /apis/rbac.authorization.k8s.io/v1/clusterrolebindings
Failed to get /apis/node.k8s.io/v1/runtimeclasses
Failed to get /apis/flowcontrol.apiserver.k8s.io/v1beta1/prioritylevelconfigurations
Failed to get /apis/flowcontrol.apiserver.k8s.io/v1beta1/flowschemas
Failed to get /api/v1/namespaces/default/replicationcontrollers
Failed to get /api/v1/namespaces/default/bindings
Failed to get /api/v1/namespaces/default/serviceaccounts
Failed to get /api/v1/namespaces/default/resourcequotas
Failed to get /api/v1/namespaces/default/limitranges
Failed to get /api/v1/namespaces/default/podtemplates
Failed to get /apis/apps/v1/namespaces/default/replicasets
Failed to get /apis/apps/v1/namespaces/default/controllerrevisions
Failed to get /apis/events.k8s.io/v1/namespaces/default/events
Failed to get /apis/authorization.k8s.io/v1/namespaces/default/localsubjectaccessreviews
Failed to get /apis/autoscaling/v1/namespaces/default/horizontalpodautoscalers
Failed to get /apis/networking.k8s.io/v1/namespaces/default/ingresses
Failed to get /apis/networking.k8s.io/v1/namespaces/default/networkpolicies
Failed to get /apis/rbac.authorization.k8s.io/v1/namespaces/default/rolebindings
Failed to get /apis/rbac.authorization.k8s.io/v1/namespaces/default/roles
Failed to get /apis/storage.k8s.io/v1beta1/namespaces/default/csistoragecapacities
Failed to get /apis/discovery.k8s.io/v1/namespaces/default/endpointslices
Failed to get /apis/helm.cattle.io/v1/namespaces/default/helmcharts
Failed to get /apis/helm.cattle.io/v1/namespaces/default/helmchartconfigs
Failed to get /apis/k3s.cattle.io/v1/namespaces/default/addons
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/default/ingressroutetcps
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/default/ingressroutes
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/default/serverstransports
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/default/traefikservices
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/default/middlewaretcps
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/default/middlewares
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/default/tlsstores
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/default/tlsoptions
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/default/ingressrouteudps
Failed to get /api/v1/namespaces/kube-system/bindings
Failed to get /api/v1/namespaces/kube-system/resourcequotas
Failed to get /api/v1/namespaces/kube-system/serviceaccounts
Failed to get /api/v1/namespaces/kube-system/podtemplates
Failed to get /api/v1/namespaces/kube-system/limitranges
Failed to get /api/v1/namespaces/kube-system/replicationcontrollers
Failed to get /apis/apps/v1/namespaces/kube-system/controllerrevisions
Failed to get /apis/apps/v1/namespaces/kube-system/replicasets
Failed to get /apis/events.k8s.io/v1/namespaces/kube-system/events
Failed to get /apis/authorization.k8s.io/v1/namespaces/kube-system/localsubjectaccessreviews
Failed to get /apis/autoscaling/v1/namespaces/kube-system/horizontalpodautoscalers
Failed to get /apis/networking.k8s.io/v1/namespaces/kube-system/networkpolicies
Failed to get /apis/networking.k8s.io/v1/namespaces/kube-system/ingresses
Failed to get /apis/rbac.authorization.k8s.io/v1/namespaces/kube-system/rolebindings
Failed to get /apis/rbac.authorization.k8s.io/v1/namespaces/kube-system/roles
Failed to get /apis/storage.k8s.io/v1beta1/namespaces/kube-system/csistoragecapacities
Failed to get /apis/discovery.k8s.io/v1/namespaces/kube-system/endpointslices
Failed to get /apis/helm.cattle.io/v1/namespaces/kube-system/helmchartconfigs
Failed to get /apis/helm.cattle.io/v1/namespaces/kube-system/helmcharts
Failed to get /apis/k3s.cattle.io/v1/namespaces/kube-system/addons
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/kube-system/serverstransports
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/kube-system/middlewaretcps
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/kube-system/middlewares
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/kube-system/tlsstores
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/kube-system/ingressrouteudps
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/kube-system/ingressroutes
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/kube-system/ingressroutetcps
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/kube-system/traefikservices
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/kube-system/tlsoptions
Failed to get /api/v1/namespaces/cattle-system/limitranges
Failed to get /api/v1/namespaces/cattle-system/podtemplates
Failed to get /api/v1/namespaces/cattle-system/resourcequotas
Failed to get /api/v1/namespaces/cattle-system/serviceaccounts
Failed to get /api/v1/namespaces/cattle-system/replicationcontrollers
Failed to get /api/v1/namespaces/cattle-system/bindings
Failed to get /apis/apps/v1/namespaces/cattle-system/replicasets
Failed to get /apis/apps/v1/namespaces/cattle-system/controllerrevisions
Failed to get /apis/events.k8s.io/v1/namespaces/cattle-system/events
Failed to get /apis/authorization.k8s.io/v1/namespaces/cattle-system/localsubjectaccessreviews
Failed to get /apis/autoscaling/v1/namespaces/cattle-system/horizontalpodautoscalers
Failed to get /apis/networking.k8s.io/v1/namespaces/cattle-system/networkpolicies
Failed to get /apis/networking.k8s.io/v1/namespaces/cattle-system/ingresses
Failed to get /apis/rbac.authorization.k8s.io/v1/namespaces/cattle-system/roles
Failed to get /apis/rbac.authorization.k8s.io/v1/namespaces/cattle-system/rolebindings
Failed to get /apis/storage.k8s.io/v1beta1/namespaces/cattle-system/csistoragecapacities
Failed to get /apis/discovery.k8s.io/v1/namespaces/cattle-system/endpointslices
Failed to get /apis/helm.cattle.io/v1/namespaces/cattle-system/helmchartconfigs
Failed to get /apis/helm.cattle.io/v1/namespaces/cattle-system/helmcharts
Failed to get /apis/k3s.cattle.io/v1/namespaces/cattle-system/addons
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/cattle-system/tlsoptions
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/cattle-system/traefikservices
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/cattle-system/middlewares
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/cattle-system/ingressroutetcps
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/cattle-system/serverstransports
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/cattle-system/ingressroutes
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/cattle-system/middlewaretcps
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/cattle-system/tlsstores
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/cattle-system/ingressrouteudps
Failed to get /api/v1/namespaces/longhorn-system/limitranges
Failed to get /api/v1/namespaces/longhorn-system/podtemplates
Failed to get /api/v1/namespaces/longhorn-system/resourcequotas
Failed to get /api/v1/namespaces/longhorn-system/replicationcontrollers
Failed to get /api/v1/namespaces/longhorn-system/serviceaccounts
Failed to get /api/v1/namespaces/longhorn-system/bindings
Failed to get /apis/apps/v1/namespaces/longhorn-system/replicasets
Failed to get /apis/apps/v1/namespaces/longhorn-system/controllerrevisions
Failed to get /apis/events.k8s.io/v1/namespaces/longhorn-system/events
Failed to get /apis/authorization.k8s.io/v1/namespaces/longhorn-system/localsubjectaccessreviews
Failed to get /apis/autoscaling/v1/namespaces/longhorn-system/horizontalpodautoscalers
Failed to get /apis/networking.k8s.io/v1/namespaces/longhorn-system/ingresses
Failed to get /apis/networking.k8s.io/v1/namespaces/longhorn-system/networkpolicies
Failed to get /apis/rbac.authorization.k8s.io/v1/namespaces/longhorn-system/rolebindings
Failed to get /apis/rbac.authorization.k8s.io/v1/namespaces/longhorn-system/roles
Failed to get /apis/storage.k8s.io/v1beta1/namespaces/longhorn-system/csistoragecapacities
Failed to get /apis/discovery.k8s.io/v1/namespaces/longhorn-system/endpointslices
Failed to get /apis/helm.cattle.io/v1/namespaces/longhorn-system/helmchartconfigs
Failed to get /apis/helm.cattle.io/v1/namespaces/longhorn-system/helmcharts
Failed to get /apis/k3s.cattle.io/v1/namespaces/longhorn-system/addons
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/longhorn-system/serverstransports
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/longhorn-system/ingressroutes
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/longhorn-system/tlsstores
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/longhorn-system/traefikservices
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/longhorn-system/tlsoptions
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/longhorn-system/middlewares
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/longhorn-system/ingressroutetcps
Failed to get /apis/traefik.containo.us/v1alpha1/namespaces/longhorn-system/middlewaretcps
```
### User Stories
#### Support Bundle Generation
This feature does not alter how the user generates the support bundle on UI.
#### Mocking Support Bundle Cluster
The user can simulate a mocked cluster with the support bundle and interact using `kubectl`.
### User Experience In Detail
#### Support Bundle Generation
1. User clicks `Generate Support BundleFile` in Longhorn UI.
1. Longhorn creates a `SupportBundle` custom resource.
1. Longhorn creates a support bundle manager deployment.
1. User downloads the support bundle the same as before.
1. Longhorn deletes the `SupportBundle` custom resource.
1. Longhorn deletes the support bundle manager deployment.
#### Support Bundle Generation Failed
1. User clicks `Generate Support BundleFile` in Longhorn UI.
1. Longhorn creates a SupportBundle custom resource.
1. Longhorn creates a support bundle manager deployment.
1. The SupportBundle goes into an error state.
1. User sees an error on UI.
1. Longhorn retains the failed SupportBundle and its support-bundle manager deployment.
1. User analyzes the failed SupportBundle on the cluster, or generates a new support bundle so the failed SupportBundle can be analyzed off-site.
1. User deletes the failed SupportBundle when done with the analysis, or has Longhorn automatically purge all failed SupportBundles by setting the [support bundle failed history limit](#manager-support-bundle-failed-history-limit-setting) to 0.
1. Longhorn deletes the SupportBundle custom resource.
1. Longhorn deletes the support bundle manager deployment.
### API changes
#### Longhorn manager HTTP API
There will be no change to the HTTP API endpoints. This feature replaces the handler function logic.
| Method | Path | Description |
| -------- | ------------------------------------------------- | ----------------------------------------------------- |
| **POST** | `/v1/supportbundles` | Creates SupportBundle custom resource |
| **GET** | `/v1/supportbundles/{name}/{bundleName}` | Get the support bundle details from the SupportBundle custom resource |
| **GET** | `/v1/supportbundles/{name}/{bundleName}/download` | Get the support bundle file from `https://<support-bundle-manager-ip>:8080/bundle` |
## Design
### Implementation Overview
#### Deployment: longhorn-support-bundle service account
Collecting the support bundle requires complete cluster access. Hence, Longhorn will have a dedicated service account created at deployment.
```yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: longhorn-support-bundle
  namespace: longhorn-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: longhorn-support-bundle
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: longhorn-support-bundle
  namespace: longhorn-system
---
```
#### Manager: HTTP API SupportBundle resource
```go
type SupportBundle struct {
	client.Resource
	NodeID             string                      `json:"nodeID"`
	State              longhorn.SupportBundleState `json:"state"`
	Name               string                      `json:"name"`
	ErrorMessage       string                      `json:"errorMessage"`
	ProgressPercentage int                         `json:"progressPercentage"`
}
```
#### Manager: POST `/v1/supportbundles`
- Creates a new `SupportBundle` custom resource.
#### Manager: GET `/v1/supportbundles/{name}/{bundleName}`
- Gets the `SupportBundle` custom resource and returns [SupportBundle resource](#manager-http-api-supportbundle-resource).
#### Manager: GET `/v1/supportbundles/{name}/{bundleName}/download`
1. Get the support bundle from [https://\<support-bundle-manager-ip>:8080/bundle](https://github.com/rancher/support-bundle-kit/blob/master/pkg/manager/httpserver.go#L104).
1. Copy the support bundle to the response writer.
1. Delete the `SupportBundle` custom resource.
#### Manager: SupportBundle custom resource
```yaml
apiVersion: v1
items:
- apiVersion: longhorn.io/v1beta2
  kind: SupportBundle
  metadata:
    creationTimestamp: "2022-11-10T02:35:45Z"
    generation: 1
    name: support-bundle-2022-11-10t02-35-45z
    namespace: longhorn-system
    resourceVersion: "97016"
    uid: a5169448-a6e5-4637-b99a-63b9a9ea0b7f
  spec:
    description: "123"
    issueURL: ""
    nodeID: ""
  status:
    conditions:
    - lastProbeTime: ""
      lastTransitionTime: "2022-11-10T03:35:29Z"
      message: done
      reason: Create
      status: "True"
      type: Manager
    filename: supportbundle_08ccc085-641c-4592-bb57-e05456241204_2022-11-10T02-36-13Z.zip
    filesize: 502608
    image: rancher/support-bundle-kit:master-head
    managerIP: 10.42.2.54
    ownerID: ip-10-0-1-113
    progress: 100
    state: ReadyForDownload
kind: List
metadata:
  resourceVersion: ""
```
#### Manager: `support-bundle-manager-image` setting
The support bundle manager image for the support bundle generation.
```
Category = general
Type = string
Default = rancher/support-bundle-kit:master-head
```
#### Manager: `support-bundle-failed-history-limit` setting
This setting specifies how many failed support bundles can exist in the cluster.
The retained failed support bundles are for analysis purposes and need to be cleaned up manually. Set this value to 0 to have Longhorn automatically purge all failed support bundles.
```
Category = general
Type = integer
Default = 1
```
#### Manager: validate at SupportBundle creation
1. Block creation if the number of failed SupportBundles exceeds the [support bundle failed history limit](#manager-support-bundle-failed-history-limit-setting).
1. Block creation if another SupportBundle is in progress. However, skip checking SupportBundles that are in an error state; we leave it to the user to decide what to do with the failed SupportBundles.
#### Manager: mutate at SupportBundle creation
1. Add finalizer.
#### Manager: SupportBundle creation handling in longhorn-support-bundle controller
This controller handles the support bundle in phases depending on its custom resource state.
At the end of each phase, the controller updates the SupportBundle custom resource state and returns; the controller then picks up the update and enqueues the resource again for the next phase.
When there is no state update, the controller automatically requeues the custom resource it is handling until the state reaches `ReadyForDownload` or `Error`. (A simplified sketch of this loop follows the state descriptions below.)
**State: None("")**
- Update the custom resource image with the setting value.
- Update the custom resource state to `Started`.
**State: Started**
- Update the state to `Generating` when the support bundle manager deployment exists.
- Create the support bundle manager deployment and requeue this phase to check the support bundle manager deployment.
**State: Generating**
- Update the [SupportBundle](#manager-supportbundle-custom-resource) status based on the support bundle manager [https://\<support-bundle-manager-ip>:8080/status](https://github.com/rancher/support-bundle-kit/blob/master/pkg/manager/httpserver.go#L103):
- IP
- file name
- progress
- filesize
- Update the custom resource state to `ReadyForDownload` when the progress reaches 100.
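A simplified, self-contained sketch of this phase handling. The state names follow the proposal; the controller type, status fields, and helper callbacks are assumptions for illustration only.
```go
package main

type state string

const (
	stateNone             state = ""
	stateStarted          state = "Started"
	stateGenerating       state = "Generating"
	stateReadyForDownload state = "ReadyForDownload"
)

// managerStatus mirrors the fields read from https://<manager-ip>:8080/status.
type managerStatus struct {
	IP       string
	Filename string
	Filesize int64
	Progress int
}

type supportBundle struct {
	Image   string
	State   state
	Manager managerStatus
}

type controller struct {
	managerImage     string
	deploymentExists func(*supportBundle) bool
	createDeployment func(*supportBundle) error
	queryStatus      func(*supportBundle) (managerStatus, error)
}

// reconcile advances the SupportBundle by one phase; the caller requeues it
// until the state reaches ReadyForDownload (error handling is omitted here).
func (c *controller) reconcile(sb *supportBundle) error {
	switch sb.State {
	case stateNone:
		sb.Image = c.managerImage
		sb.State = stateStarted
	case stateStarted:
		if c.deploymentExists(sb) {
			sb.State = stateGenerating
			return nil
		}
		return c.createDeployment(sb)
	case stateGenerating:
		status, err := c.queryStatus(sb)
		if err != nil {
			return err
		}
		sb.Manager = status
		if status.Progress >= 100 {
			sb.State = stateReadyForDownload
		}
	}
	return nil
}
```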
#### Manager: SupportBundle error handling in longhorn-support-bundle controller
- Update the state to `Error` and record the error type condition when the phase encounters unexpected failure.
- When the [support bundle failed history limit](#manager-support-bundle-failed-history-limit-setting) is 0, update the state to `Purging`.
**Purging**
- Delete all failed SupportBundles in the state `Error`.
#### Manager: SupportBundle deletion handling in longhorn-support-bundle controller
When the SupportBundle gets marked with `DeletionTimestamp`, the controller updates its state to `Deleting`.
**Deleting**
- Delete its support bundle manager deployment.
- Remove the SupportBundle finalizer.
#### Manager: SupportBundle purge handling in longhorn-setting controller
- If the [support bundle failed history limit](#manager-support-bundle-failed-history-limit-setting) is 0, update all failed SupportBundle state to `Purging`.
### Test plan
- Test support bundle generation should be successful.
- Test support bundle should be cleaned up after download.
- Test support bundle should be retained when generation fails.
- Test support bundle should generate when the cluster has an existing `SupportBundle` in an error state.
- Test support bundle should be purged when `support bundle failed history limit` is set to 0.
- Test support bundle cluster simulation.
### Upgrade strategy
`None`
## Note [optional]
`None`

View File

@ -0,0 +1,68 @@
# Local Volume
## Summary
Longhorn can support local volume to provide better IO latencies and IOPS.
### Related Issues
https://github.com/longhorn/longhorn/issues/3957
## Motivation
### Goals
- Longhorn can support local volume (data locality=strict-local) for providing better IO latencies and IOPS.
- A local volume can only have one replica.
- A local volume supports operations such as snapshot, backup, etc.
### Non-goals
- A local volume's data locality cannot be converted to other modes when the volume is not detached.
- A local volume does not support multiple replicas in the first version. The local replication could be an improvement in the future.
## Proposal
1. Introduce a new volume type: a local volume with `strict-local` data locality.
- Different from a volume with `best-effort` data locality, the engine and replica of a local volume have to be located on the same node.
2. A Unix-domain socket is used instead of TCP between the replica process's data server and the engine.
3. A local volume supports the existing functionalities such as snapshotting, backup, restore, etc.
### User Stories
Longhorn is a highly available replica-based storage system. As the data path is designed for replication, a volume with a single replica still suffers from high IO latency. In some cases, distributed data workloads such as databases already have their own data replication, sharding, etc., so we should provide a volume type for these use cases while supporting existing volume functionalities like snapshotting, backup/restore, etc.
### User Experience In Detail
- The functionalities and behaviors of the volumes with `disabled` and `best-effort` data localities will not be changed.
- A volume with `strict-local` data locality
- Only has one replica
- The engine and replica have to be located on the same node
- Cannot convert to `disabled` or `best-effort` data locality when the volume is not detached
- Can convert to `disabled` or `best-effort` data locality when the volume is detached
- Existing functionalities such as snapshotting, backup, restore, etc. are supported
### CLI Changes
- Add `--volume-name` in engine-binary `replica` command
- The unix-domain-socket file will be `/var/lib/longhorn/unix-domain-socket/${volume name}.sock`
- Add `--data-server-protocol` in engine-binary `replica` command
- Available options are `tcp` (default) and `unix`
- Add `--data-server-protocol` in engine-binary `controller` command
- Available options are `tcp` (default) and `unix`
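To illustrate the `--data-server-protocol` option above, here is a minimal sketch of how the replica data server could choose between TCP and a unix-domain socket. The function name and the TCP address are illustrative assumptions; only the socket path follows the convention stated in this proposal.
```go
package main

import (
	"fmt"
	"net"
)

// listenDataServer opens the replica data server listener for the requested
// protocol ("tcp" or "unix").
func listenDataServer(protocol, tcpAddress, volumeName string) (net.Listener, error) {
	switch protocol {
	case "tcp":
		return net.Listen("tcp", tcpAddress) // e.g. "0.0.0.0:9502"
	case "unix":
		sock := fmt.Sprintf("/var/lib/longhorn/unix-domain-socket/%s.sock", volumeName)
		return net.Listen("unix", sock)
	default:
		return nil, fmt.Errorf("unsupported data server protocol %q", protocol)
	}
}
```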
## Design
### Implementation Overview
#### CRDs
1. Add a new data locality `strict-local` in `volume.Spec.DataLocality`
#### Volume Creation and Attachment
- When creating and attaching a volume with `strict-local` data locality, the replica is scheduled on the node where the engine is located.
- Afterward, the replica process is created with the options `--volume-name ${volume name}` and `--data-server-protocol unix`.
- The data server in the replica process is created and listens on a unix-domain-socket file (`/var/lib/longhorn/unix-domain-socket/${volume name}.sock`).
- Then, the engine process of the volume is created with the option `--data-server-protocol unix`.
- The client in the engine process connects to the data server in the replica process via the unix-domain-socket file.
### Validating Webhook
- For a volume with `strict-local` data locality, `numberOfReplicas` must be 1.
- If a local volume is attached, converting between `strict-local` and other data localities is not allowed.
- If a local volume is attached, updating the replica count is not allowed.
### Test Plan
#### Integration tests
1. Successfully create a local volume with `numberOfReplicas=1` and `dataLocality=strict-local`.
2. Check the validating webhook can reject the following cases when the volume is created or attached
- Create a local volume with `dataLocality=strict-local` but `numberOfReplicas>1`
- Update an attached local volume's `numberOfReplicas` to a value greater than one
- Update an attached local volume's `dataLocality` to `disabled` or `best-effort`

View File

@ -0,0 +1,78 @@
# Concurrent Backup Restore Per Node Limit
## Summary
Longhorn has no bound on the number of concurrent volume backup restores.
A new `concurrent-backup-restore-per-node-limit` setting allows the user to limit the concurrent backup restores. Setting this restriction lowers the risk of overloading the cluster when volumes are restored from backup concurrently, for example during a Longhorn system restore.
### Related Issues
https://github.com/longhorn/longhorn/issues/4558
## Motivation
### Goals
Introduce a new `concurrent-backup-restore-per-node-limit` setting to bound the number of concurrent volume backup restores.
### Non-goals
`None`
## Proposal
1. Introduce a new `concurrent-backup-restore-per-node-limit` setting.
1. Track the number of volumes restoring from backup per node with an atomic (thread-safe) counter in the engine monitor.
### User Stories
Allow the user to set the concurrent backup restore per node limit to control the risk of cluster overload when Longhorn volumes are restoring from backup concurrently.
### User Experience In Detail
1. Longhorn holds the engine backup restore when the number of volume backups restoring on a node reaches the `concurrent-backup-restore-per-node-limit`.
1. The volume backup restore continues when the number of volume backups restoring on a node is below the limit.
## Design
### Implementation Overview
#### The `concurrent-backup-restore-per-node-limit` Setting
This setting controls how many engines on a node can restore the backup concurrently.
Longhorn engine monitor backs off when the volume [backup restoring count](#track-the-volume-backup-restoring-per-node) reaches the setting limit.
Set the value to **0** to disable backup restore.
```
Category = SettingCategoryGeneral,
Type = integer
Default = 5 # same as the default replica rebuilding number
```
#### Track the volume backup restoring per node
1. Create a new atomic counter in the engine controller.
```
type EngineController struct {
	restoringCounter util.Counter
}
```
1. Pass the restoring counter to each of its engine monitors.
```
type EngineMonitor struct {
	restoringCounter util.Counter
}
```
1. Increase the restoring counter before backup restore.
> Ignore DR volumes (volume.Status.IsStandby).
1. Decrease the restoring counter when the backup restore caller method ends.
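A minimal sketch of the thread-safe counter and of how the engine monitor could gate a restore on the per-node limit. The `util.Counter` interface is assumed to look roughly like this; it is not the actual longhorn-manager code.
```go
package main

import "sync/atomic"

// Counter is a minimal thread-safe counter, standing in for util.Counter.
type Counter struct {
	count int32
}

func (c *Counter) Increase() int32 { return atomic.AddInt32(&c.count, 1) }
func (c *Counter) Decrease() int32 { return atomic.AddInt32(&c.count, -1) }
func (c *Counter) Get() int32      { return atomic.LoadInt32(&c.count) }

// canRestore reports whether another backup restore may start on this node.
// A limit of 0 disables backup restore entirely, as the setting describes.
func canRestore(restoringCounter *Counter, limit int32) bool {
	if limit == 0 {
		return false
	}
	return restoringCounter.Get() < limit
}
```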
### Test plan
- Test the setting should block backup restore when creating multiple volumes from the backup at the same time.
- Test the setting should be per-node limited.
- Test the setting should have no effect on DR volumes.
### Upgrade strategy
`None`
## Note [optional]
`None`

View File

@ -0,0 +1,181 @@
# Reimplement Longhorn Engine with SPDK
## Summary
The Storage Performance Development Kit [SPDK](https://spdk.io) provides a set of tools and C libraries for writing high performance, scalable, user-mode storage applications. It achieves high performance through the use of a number of key techniques:
* Moving all of the necessary drivers into userspace, which avoids syscalls and enables zero-copy access from the application.
* Polling hardware for completions instead of relying on interrupts, which lowers both total latency and latency variance.
* Avoiding all locks in the I/O path, instead relying on message passing.
SPDK has several features that allow it to perform tasks similar to what the `longhorn-engine` currently needs:
* [Block Device](https://spdk.io/doc/bdev.html) layer, often simply called bdev, intends to be equivalent to the operating system block storage layer that often sits immediately above the device drivers in a traditional kernel storage stack. SPDK also provides virtual bdev modules, which create block devices on top of existing bdevs, for example Logical Volumes or RAID1.
* [Logical volumes](https://spdk.io/doc/logical_volumes.html) library is a flexible storage space management system. It allows creating and managing virtual block devices with variable size on top of other bdevs. The SPDK Logical Volume library is built on top of [Blobstore](https://spdk.io/doc/blob.html), which is a persistent, power-fail safe block allocator designed to be used as the local storage system backing a higher level storage service, typically in lieu of a traditional filesystem. Logical volumes have a couple of features, like thin provisioning and snapshots, similar to what the current longhorn-engine provides.
* [NVMe over Fabrics](https://spdk.io/doc/nvmf.html) is a feature that presents block devices over a fabric such as Ethernet, supporting RDMA and TCP transports. The standard Linux kernel initiators for NVMe-oF interoperate with these SPDK NVMe-oF targets, so with this feature we can serve bdevs over the network or to other processes.
## Motivation
These are the reasons that have driven us:
* Use SPDK to improve performance of Longhorn
* Use SPDK functionality to improve reliability and robustness
* Use SPDK to take advantage of the new features that are continuously added to the framework
### Goals
* Implement all actual `longhorn-engine` functionalities
* Continue to support multiple `longhorn-engine` versions concurrently
* Maintain as much as possible the same user experience between Longhorn with and without SPDK
* Lay the groundwork for extending Longhorn to sharding and aggregation of storage devices
## Proposal
SPDK implements a JSON-RPC 2.0 server to allow external management tools to dynamically configure SPDK components ([documentation](https://spdk.io/doc/jsonrpc.html)).
We aim to create an external orchestrator that, with JSON-RPC calls towards multiple instances of the `spdk_tgt` app running on different machines, can manage the durability and reliability of the data. Not all of the functionality needed to do that is already available in SPDK, so some new JSON-RPC commands will be developed on top of SPDK. This orchestrator is implemented in the longhorn-manager pods and will use a new process, called `longhorn-spdk-engine` in continuity with the current `longhorn-engine`, to talk with `spdk_tgt`.
* The main purpose of `longhorn-spdk-engine` is to create and export via NVMe-oF logical volumes from multiple replica nodes (one of them likely local), attach to these volumes on a controller node, use the resulting bdevs to create a RAID1 bdev, and export it via NVMe-oF locally. At this point the NVMe kernel module can be used to connect to this NVMe-oF subsystem and thus create a block device `/dev/nvmeXnY` to be used by the Longhorn CSI driver. In this way we will have multiple replicas of the same data written through this block device.
* Below a diagram that shows the control plane of the proposal ![SPDK New Architecture](./image/spdk-control-plane.png)
* In release 23.01, support for ublk will be added to SPDK: with this functionality we can directly create a block device without using the NVMe layer, on Linux kernel versions newer than 6.0. This will be quite a big enhancement over using NVMe-oF locally.
The `longhorn-spdk-engine` will be responsible for all other control operations, for example creating snapshots over all replicas of the same volume. Other functionalities orchestrated through the engine will be the remote rebuild (a complete rebuild of the entire snapshot stack of a volume, needed to add or repair a replica) and backup and restore, i.e. the export/import of SPDK logical volumes to/from sparse files stored on an external storage system via S3.
The `longhorn-spdk-engine` will be developed in Go, so we may be able to reuse some code from `longhorn-engine`, for example the gRPC handling used to receive control commands and the error handling during snapshot/backup/restore operations.
As for the data plane, below is a comparison between the current architecture and the new design:
* longhorn-engine ![](./image/engine-data-plane.png)
* spdk_tgt
![](./image/spdk-data-plane.png)
## Design
### Implementation Overview
Currently there are a `longhorn-engine` controller and several `longhorn-engine` replicas for every volume to manage. All these instances are started and controlled by the `instance-manager`, so on every node belonging to the cluster we have one instance of `instance-manager` and multiple instances of `longhorn-engine`. Every volume is stored as a sequence of sparse files representing the live data and the snapshots. With SPDK the situation is different, because `spdk_tgt` can take control of an entire disk, so on every node we will have a single instance of SPDK that handles all the volumes created by Longhorn.
To orchestrate the SPDK instances running on different nodes so that they make up a set of replicas, we will introduce, as discussed before, the `longhorn-spdk-engine`; to keep the volume management light we will have one instance of the engine per volume. `longhorn-spdk-engine` will implement the current gRPC interface used by `longhorn-engine` to talk with `instance-manager`, so the latter will become the portal through which `longhorn-manager` communicates with either data plane.
`spdk_tgt` by default starts with a single thread, but it can be configured to use multiple threads: we can have a thread per core available on the CPU. This increases performance but comes at the cost of high CPU utilization. Working in polling mode instead of interrupt mode, the CPU core utilization of a single thread stays at 100% even with no workload to handle. This could be a problem, so we can configure `spdk_tgt` with the dynamic scheduler: in this way, if no workload is present, only one core is used and only one thread keeps polling. The other threads are put into an idle state and become active again only when needed. Moreover, the dynamic scheduler has a way to reduce the CPU frequency. (See the future work section.)
### Snapshots
When `longhorn-spdk-engine` receive a snapshot request from `instance-manager`, before to proceed all I/O operations over volume's block device `/dev/nvmeXnY` must be stopped to ensure that snapshots over all the replicas contains the same data.
Currently there is no way to suspend the I/O operations on a block device, so we will have to implement this feature in SPDK. The RAID bdev module already contains some private functions to suspend I/O (used, for example, when removing a base bdev), so we may be able to reuse and improve them. These functions queue all the I/O operations received during the suspension.
Once it has received a snapshot request, `longhorn-spdk-engine` will call the JSON-RPC to take a snapshot of the local replica of the volume involved. The snapshot RPC command freezes all I/O on the logical volume being snapshotted, so all pending I/O is executed before the snapshot is taken.
SPDK logical volumes have a couple of features that we will use (see the sketch after this list):
* clone, used to create a new logical volume based on a snapshot. It can also be used to revert a volume to a snapshot: clone a new volume from the snapshot, delete the old volume and then rename the new one to the old name
* decouple, a feature that can be used to delete a snapshot: first decouple the child volume from the snapshot and then delete the snapshot.
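For reference, a minimal sketch of these logical volume operations with `rpc.py`; the lvstore and volume names are placeholder assumptions:
```bash
# Take a snapshot of a logical volume.
rpc.py bdev_lvol_snapshot lvstore0/vol1 vol1-snap1

# Revert to the snapshot: clone a new writable volume from it,
# then drop the old volume and rename the clone to the old name.
rpc.py bdev_lvol_clone lvstore0/vol1-snap1 vol1-reverted
rpc.py bdev_lvol_delete lvstore0/vol1
rpc.py bdev_lvol_rename lvstore0/vol1-reverted vol1

# Delete a snapshot: first decouple the child from its parent snapshot,
# then delete the snapshot itself.
rpc.py bdev_lvol_decouple_parent lvstore0/vol1
rpc.py bdev_lvol_delete lvstore0/vol1-snap1
```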
### Replica rebuild
RAID replica rebuild is currently under development in SPDK, so we don't know exactly how it will be implemented, but we suppose we will not use it, because presumably it will work only at the bdev layer.
When a new replica has to be added or an existing replica has to be rebuilt, we have to recreate the entire snapshot stack of each volume hosted on that node. SPDK currently has nothing to do that, but after discussing with the core maintainers we agreed on a procedure. Let's work through an example.
Suppose we have to rebuild a volume with two layers of snapshots, where snapshotA is the older and snapshotB the younger; basically we have to (missing pieces in _italic_):
* create a new volume on the node to be rebuilt
* export this volume via NVMe-oF
* attach to this volume in the node where we have the source data
* _copy snapshotA over the attached volume_
* perform a snapshot over the exported volume
* repeat the copy and snapshot operations for snapshotB
What we have to implement is a JSON-RPC to copy a logical volume onto an arbitrary bdev (which in our case will represent a remote volume exported via NVMe-oF and locally attached) _while the top layer is also being modified_ (see the next section).
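A hedged sketch of the rebuild sequence above; `bdev_lvol_shallow_copy` stands in for the missing copy RPC and is purely hypothetical here, as are the names, sizes and addresses:
```bash
# On the node being rebuilt: create an empty thin volume and export it via NVMe-oF.
rpc.py bdev_lvol_create -l lvstore0 --thin-provision vol1-new 10240
rpc.py nvmf_create_subsystem nqn.2023-01.io.longhorn:vol1-new -a
rpc.py nvmf_subsystem_add_ns nqn.2023-01.io.longhorn:vol1-new lvstore0/vol1-new
rpc.py nvmf_subsystem_add_listener nqn.2023-01.io.longhorn:vol1-new -t tcp -a 10.0.0.13 -s 4420

# On the source node: attach the exported volume and copy snapshotA onto it.
rpc.py bdev_nvme_attach_controller -b newreplica -t tcp -a 10.0.0.13 -s 4420 -f ipv4 \
    -n nqn.2023-01.io.longhorn:vol1-new
rpc.py bdev_lvol_shallow_copy lvstore0/snapshotA newreplican1     # hypothetical copy RPC

# Back on the rebuilt node: snapshot the volume, then repeat copy + snapshot for snapshotB.
rpc.py bdev_lvol_snapshot lvstore0/vol1-new snapshotA
```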
So, in this way we can rebuild the snapshot stack of a volume. But what about the live data? The current `longhorn-engine` performs the replica rebuild in a "hot" way, i.e., during the rebuilding phase it keeps writing the live data of the new volume. So, how can we reproduce this with SPDK? First of all we have to wait for the conclusion of the RAID1 bdev review process to see what kind of replica rebuild will be implemented. But, supposing the rebuild feature will not be useful for us, we will need to create a couple of additional JSON-RPCs in SPDK to implement the following procedure (missing pieces in _italic_):
* create a new volume on the node to be rebuilt
* export this volume via NVMe-oF
* attach to this volume in the node where we have the RAID1
* _add the bdev of the attached volume to the RAID1 bdev, excluding it from read balancing_
* wait for the snapshot stack rebuilding to finish
* _change the upper volume of the snapshot stack from the current to this one with the live data_
* _enable the bdev of the attached volume for RAID1 read balancing_
What we have at the end of the rebuilding phase is a snapshot stack with an empty volume at the top, while in the RAID1 we have a volume with the live data but without any snapshot. So we have to couple these two stacks by exchanging the upper volume, and to do that we need a new JSON-RPC. We will also need to implement the JSON-RPC to enable/disable a bdev in the RAID1 read balancing.
### Backup and Restore
Backup will be implemented by exporting a volume to a sparse file and then saving this file to an external storage via S3. SPDK already has an `spdk_dd` application that can copy a bdev to a file, and this app has an option to preserve bdev sparseness. But using `spdk_dd` has some problems: currently the sparse option works only with a bdev that represents a local logical volume, not one exported via NVMe-oF. So to back up a volume we cannot work on a remote node where the volume is exported; we need to work on the node where the data source is. But in this way, to perform a backup, we would need to stop the `spdk_tgt` app, run `spdk_dd` and then restart `spdk_tgt`. This is needed because it may not be safe to run multiple SPDK applications on the same disk (even if `spdk_dd` would read from a read-only volume), and moreover `spdk_dd` may not see the volume to export if it was created after the last restart of the `spdk_tgt` app. This is because blobstore metadata, and therefore newly created logical volumes, are saved to disk only on application exit.
Stopping `spdk_tgt` is not acceptable because it would suspend operation of all other volumes hosted on the node, so to solve these problems we have two possible solutions:
* create a JSON-RPC command to export logical volume to a sparse file, so that we can make the operation directly over the `spdk_tgt` app
* create a custom NVMe-oF command to implement the seek_data and seek_hole functionalities of bdev used by `spdk_dd` to skip holes
With the second solution we could export the volume via NVMe-oF to a dedicated node and perform the backup there with the `spdk_dd` application.
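As an illustration of the second option, a hedged sketch of running `spdk_dd` on the dedicated backup node; the bdev JSON config, bdev/file names and block size are placeholder assumptions, and the sparseness handling depends on the seek_data/seek_hole support discussed above:
```bash
# spdk_dd runs as its own SPDK application; the NVMe-oF attached bdev is described
# in a bdev configuration file passed via --json (content omitted here).
spdk_dd --json /etc/spdk/backup-bdevs.json \
        --ib=backupsrcn1 --of=/backup/vol1-snapA.img --bs=1048576
# The resulting (sparse) file is then uploaded to the S3 backup target.
```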
The restore operation can be done in a couple of ways:
* read the backup sparse file and write its content into the Longhorn block device. In this way the data will be fully replicated
* clone from backup into each replica, importing the backup sparse file into a new thinly provisioned logical volume. We can perform this operation on the local node owning the new volume if, for the backup process, we choose to develop a JSON-RPC to export/import logical volumes to/from sparse files. Otherwise we can do it on a dedicated node with the `spdk_dd` application, which handles sparse files with the SEEK_HOLE and SEEK_DATA functionalities of `lseek`.
If we leverage the same backup & restore mechanism as `longhorn-engine`, we can restore a backup made by the current engine to an SPDK volume.
### Remote Control
The JSON-RPC API is by default only available over the `/var/tmp/spdk.sock` Unix domain socket, but SPDK offers the sample Python script [rpc_http_proxy](https://spdk.io/doc/jsonrpc_proxy.html), which provides an HTTP server listening for JSON objects from users. Alternatively we could use the `socat` application to forward requests received on an IP socket to the Unix socket. Both `socat` and `rpc_http_proxy` can perform password-based user authentication.
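For example, a minimal sketch of the `socat` option (the TCP port is an arbitrary assumption):
```bash
# Forward JSON-RPC requests received on TCP port 9009 to the local SPDK Unix socket.
socat TCP-LISTEN:9009,reuseaddr,fork UNIX-CONNECT:/var/tmp/spdk.sock
```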
### Upgrade Strategy
What kind of upgrade/migration will we support?
For out-of-cluster migration we can use the restore procedure to create SPDK logical volumes starting from existing Longhorn files. For in-cluster migration, instead, we can retain read support for the old format while writing new data via SPDK.
As for `spdk_tgt` updates, we can perform a rolling update, updating nodes one by one. Stopping `spdk_tgt` on a node will cause:
* the stop of all the volumes controlled on the node. To avoid service interruption the node must be evacuated before the update. A trick is to delay the update until the node has to be rebooted for a kernel update anyway.
* the stop of all the replicas hosted on the node. This is not a problem because during the update the I/O will be redirected towards the other replicas of the volume. To update a node cleanly, before stopping `spdk_tgt` we have to notify all the nodes that have a bdev imported via NVMe-oF from this node so that they detach the controllers involved.
Moreover this is a good time to introduce backup versioning, which allows us to change/improve the backup format [REF: GH3175](https://github.com/longhorn/longhorn/issues/3175)
### Future Work
* For Edge use cases, energy efficiency is important. We may need further scheduler enhancements and an interrupt-driven mode during low-load periods. [Here](https://www.snia.org/educational-library/spdk-schedulers-%E2%80%93-saving-cpu-cores-polled-mode-storage-application-2021) is an introduction to SPDK schedulers that briefly describes the interrupt mode.
### Roadmap
For Longhorn 1.5, we need to have the below capabilities:
* replica (RAID1)
* snapshot (create, delete/purge, revert)
* replica rebuilding
* volume clone
For 1.6, we need the rest of the feature parity functions:
* volume backup & restore
* DR volume restore (incremental restore from another volume backup)
* volume encryption
* create volume from the backing image
* create backing image from volume  
* volume expansion
* volume trim
* volume metrics (bandwidth, latency, IOPS)
* volume data integrity (snapshot checksum)
SPDK uses a quarterly release cycle; the next release will be 23.01 (January 2023). Assuming the current RAID1 implementation will be available in the 23.01 release, the JSON-RPCs we need to implement in SPDK are:
* suspend I/O operation
* copy a snapshot over an arbitrary bdev
* add bdev to raid1 in read balancing disabled mode
* enable/disable bdev in raid1 read balancing
* export/import file to/from bdev or implement seek_data/hole in NVMe-oF
The first item is necessary for snapshots, the last one for backup/restore, and the other three are necessary for replica rebuilding.
The snapshot copy has already been discussed with the SPDK core maintainers, so it can be developed upstream.
### Limitations
The current RAID1 implementation is not complete yet, so we currently have some limitations:
* read balancing has been developed but is still under review, so it is available only in SPDK Gerrit
* replica rebuild is still under development, so it isn't available. As a consequence, RAID1 currently lacks the functionality to add a new base bdev to an existing RAID1 bdev.


@ -0,0 +1,126 @@
# Recurring Snapshot Cleanup
## Summary
Currently, Longhorn's recurring job automatically cleans up older snapshots of volumes to retain no more than the defined snapshot number. However, this is limited to the snapshots created by the recurring job. For non-recurring volume snapshots or snapshots created by backups, the user needs to clean them up manually.
Having periodic snapshot cleanup could help to delete/purge those extra snapshots regardless of the creation method.
### Related Issues
https://github.com/longhorn/longhorn/issues/3836
## Motivation
### Goals
Introduce new recurring job types:
- `snapshot-delete`: periodically remove and purge all kinds of snapshots that exceed the retention count.
- `snapshot-cleanup`: periodically purge removable or system snapshots.
### Non-goals [optional]
`None`
## Proposal
- Introduce two new `RecurringJobType`:
- snapshot-delete
- snapshot-cleanup
- Recurring job periodically deletes and purges the snapshots for RecurringJob using the `snapshot-delete` task type. Longhorn will retain snapshots based on the given retain number.
- Recurring job periodically purges the snapshots for RecurringJob using the `snapshot-cleanup` task type.
### User Stories
- The user can create a RecurringJob with `spec.task=snapshot-delete` to instruct Longhorn to periodically delete and purge snapshots.
- The user can create a RecurringJob with `spec.task=snapshot-cleanup` to instruct Longhorn to periodically purge removable or system snapshots.
### User Experience In Detail
#### Recurring Snapshot Deletion
1. Have some volume backups and snapshots.
1. Create RecurringJob with the `snapshot-delete` task type.
```yaml
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
name: recurring-snap-delete-per-min
namespace: longhorn-system
spec:
concurrency: 1
cron: '* * * * *'
groups: []
labels: {}
name: recurring-snap-delete-per-min
retain: 2
task: snapshot-delete
```
1. Assign the RecurringJob to the volume (see the example after this list).
1. Longhorn deletes all expired snapshots. As a result of the above example, the user will see two snapshots after the job completes.
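For example, the RecurringJob can be assigned by labeling the volume custom resource; a hedged sketch assuming the `recurring-job.longhorn.io` label convention and the job name above:
```bash
kubectl -n longhorn-system label volumes.longhorn.io <volume-name> \
    recurring-job.longhorn.io/recurring-snap-delete-per-min=enabled
```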
#### Recurring Snapshot Cleanup
1. Have some system snapshots.
1. Create RecurringJob with the `snapshot-cleanup` task type.
```yaml
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
name: recurring-snap-cleanup-per-min
namespace: longhorn-system
spec:
concurrency: 1
cron: '* * * * *'
groups: []
labels: {}
name: recurring-snap-cleanup-per-min
task: snapshot-cleanup
```
1. Assign the RecurringJob to volume.
1. Longhorn deletes all expired system snapshots. As a result of the above example, the user will see 0 system snapshots after the job completes.
### API changes
`None`
## Design
### Implementation Overview
#### The RecurringJob `snapshot-delete` Task Type
1. List all expired snapshots (similar to the current `listSnapshotNamesForCleanup` implementation), and use as the [cleanupSnapshotNames](https://github.com/longhorn/longhorn-manager/blob/d20e1ca6e04b229b9823c1a941d865929007874c/app/recurring_job.go#L418) in `doSnapshotCleanup`.
1. Continue with the current implementation to purge snapshots.
#### The RecurringJob `snapshot-cleanup` Task Type
1. Do snapshot purge only in `doSnapshotCleanup`.
### RecurringJob Mutate
1. Mutate the `Recurringjob.Spec.Retain` to 0 when the task type is `snapshot-cleanup` since retain value has no effect on the purge.
### Test plan
#### Test Recurring Snapshot Delete
1. Create volume.
1. Create 2 volume backups.
1. Create 2 volume snapshots.
1. Create a snapshot RecurringJob with the `snapshot-delete` task type.
1. Assign the RecurringJob to volume.
1. Wait until the recurring job is completed.
1. Should see the number of snapshots matching the Recurring job `spec.retain`.
#### Test Recurring Snapshot Cleanup
1. Create volume.
1. Create 2 volume system snapshots, ex: delete replica, online expansion.
1. Create a snapshot RecurringJob with the `snapshot-cleanup` task type.
1. Assign the RecurringJob to volume.
1. Wait until the recurring job is completed.
1. Should see the volume has 0 system snapshots.
### Upgrade strategy
`None`
## Note [optional]
`None`


@ -0,0 +1,167 @@
# Improve Backup and Restore Efficiency using Multiple Threads and Faster Compression Methods
## Summary
Longhorn is capable of backing up or restoring volumes using multiple threads and more efficient compression methods, improving the Recovery Time Objective (RTO).
### Related Issues
[https://github.com/longhorn/longhorn/issues/5189](https://github.com/longhorn/longhorn/issues/5189)
## Motivation
### Goals
- Support multi-threaded volume backup and restore.
- Support efficient compression algorithm (`lz4`) and disable compression.
- Support backward compatibility of existing backups compressed by `gzip`.
### Non-goals
- Larger backup block size helps improve the backup efficiency more and decrease the block lookup operations. In the enhancement, the adaptive large backup block size is not supported and will be handled in https://github.com/longhorn/longhorn/issues/5215.
## Proposal
1. Introduce multi-threaded volume backup and restore. The numbers of backup and restore threads are configurable by users.
2. Introduce efficient compression methods. By default, the compression method is `lz4`, and users can globally change it to `none` or `gzip`. Additionally, the per-volume compression method can be customized.
3. Existing backups compressed by `gzip` will not be impacted.
## User Stories
Longhorn supports the backup and restore of volumes. Although the underlying compute and storage are powerful, the single-threaded implementation and the low-efficiency `gzip` compression method lead to slower backup and restore times and a poor RTO. The enhancement aims to increase backup and restore efficiency through the use of multiple threads and efficient compression methods. The new parameters can be configured to accommodate a variety of applications and platforms, such as limiting the number of threads on an edge device or disabling compression for multimedia data.
### User Experience In Details
- For existing volumes that already have backups, the compression method remains `gzip` for backward compatibility. Multi-threaded backups and restores are supported for subsequent backups.
- By default, the global backup compression method is set to `lz4`. By editing the global setting `backup-compression-method`, users can configure the compression method to `none` or `gzip`. The backup compression method can also be customized per volume by editing `volume.spec.backupCompressionMethod` to suit the data format in the volume.
- The number of backup threads per backup is configurable by the global setting `backup-concurrent-limit`.
- The number of restore threads per restore is configurable by the global setting `restore-concurrent-limit`.
- Changing the compression method of a volume that already has backups is not supported.
### CLI Changes
- Add `compression-method` to longhorn-engine binary `backup create` command.
- Add `concurrent-limit` to longhorn-engine binary `backup create` command.
- Add `concurrent-limit` to longhorn-engine binary `backup restore` command.
### API Changes
- engine-proxy
- Add `compressionMethod` and `concurrentLimit` to EngineSnapshotBackup method.
- Add `concurrentLimit` to `EngineBackupRestore` method.
- syncagent
- Add `compressionMethod` and `concurrentLimit` to syncagent `BackupCreate` method.
- Add `concurrentLimit` to syncagent `BackupRestore` method.
## Design
### Implementation Overview
#### Global Settings
- backup-compression-method
- This setting allows users to specify backup compression method.
- Options:
- `none`: Disable compression. Suitable for multimedia data such as encoded images and videos.
- `lz4`: Suitable for text files.
- `gzip`: A somewhat higher compression ratio, but relatively slow. Not recommended.
- Default: lz4
- backup-concurrent-limit
- This setting controls how many worker threads run concurrently per backup job.
- Default: 5
- restore-concurrent-limit
- This setting controls how many worker threads run concurrently per restore job.
- Default: 5
#### CRDs
1. Introduce `volume.spec.backupCompressionMethod`
2. Introduce `backup.status.compressionMethod`
#### Backup
A producer-consumer pattern is used to achieve multi-threaded backups. In this implementation, there is one producer and multiple consumers, whose number is controlled by the global setting `backup-concurrent-limit`. (A sketch of the pattern follows the lists below.)
- Producer
- Open the disk file to be backed up and create a `Block` channel.
- Iterate the blocks in the disk file.
- Skip sparse blocks.
- Send the data blocks information including offset and size to `Block` channel.
- Close the `Block` channel after finishing the iteration.
- Consumers
- Block handling goroutines (consumers) are created and consume blocks from the `Block` channel.
- Processing blocks
- Calculate the checksum of the incoming block.
- Check the in-memory `processingBlocks` map to determine whether the block is being processed.
- If YES, simply append the block to `Blocks`, which records the blocks processed in the backup.
- If NO, check the remote backupstore to determine whether the block exists.
- If YES, append the block to `Blocks`.
- If NO, compress the block, upload it, and finally append it to `Blocks`.
- After the blocks have been consumed and the `Block` channel has been closed, the goroutines are terminated.
Then, update the volume and backup metadata files in the remote backupstore.
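As a rough illustration of this producer-consumer pattern, a minimal Go sketch; the `Block` type, the `process` callback and the concurrency value are placeholder assumptions and do not mirror the actual longhorn-engine code:
```go
package main

import (
	"fmt"
	"sync"
)

// Block is the unit of work the producer sends to the consumers.
type Block struct {
	Offset int64
	Size   int64
}

// runBackup feeds data blocks into a channel (producer) and lets `concurrent`
// goroutines (consumers) checksum, compress and upload them via `process`.
func runBackup(blocks []Block, concurrent int, process func(Block) error) error {
	blockCh := make(chan Block)
	var wg sync.WaitGroup
	var mu sync.Mutex
	var firstErr error

	// Consumers: drain the channel until it is closed.
	for i := 0; i < concurrent; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for b := range blockCh {
				if err := process(b); err != nil {
					mu.Lock()
					if firstErr == nil {
						firstErr = err
					}
					mu.Unlock()
				}
			}
		}()
	}

	// Producer: send every (non-sparse) block, then close the channel so the
	// consumers terminate once the backlog is drained.
	for _, b := range blocks {
		blockCh <- b
	}
	close(blockCh)
	wg.Wait()
	return firstErr
}

func main() {
	blocks := []Block{{0, 2 << 20}, {4 << 20, 2 << 20}}
	err := runBackup(blocks, 5, func(b Block) error {
		fmt.Printf("uploading block at offset %d\n", b.Offset)
		return nil
	})
	fmt.Println("backup finished, err:", err)
}
```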
#### Restore
A producer-consumer pattern is used to achieve multi-threaded restores. In this implementation, there is one producer and multiple consumers, whose number is controlled by the global setting `restore-concurrent-limit`.
- Producer
- Create a `Block` channel.
- Open the backup metadata file and get each block's information: offset, size and checksum.
- Iterate the blocks and send the block information to the `Block` channel.
- Close the `Block` channel after finishing the iteration.
- Consumers
- Block handling goroutines (consumers) are created and consume blocks from the `Block` channel.
- It is necessary for each consumer to open the disk file in order to avoid race conditions between the seek and write operations.
- Read the block data from the backupstore, verify the data integrity and write to the disk file.
- After the blocks have been consumed and the `Block` channel has been closed, the goroutines are terminated.
### Performance Benchmark
In summary, backup throughput is increased by 15x when using `lz4` and `10` concurrent threads in comparison with the backup in Longhorn v1.4.0. The restore throughput (to a volume with 3 replicas) is increased by 140%, and is limited by the IO bound of the backupstore server.
#### Setup
| | |
|---|---|
| Platform | Equinix |
| Host | Japan-Tokyo/m3.small.x86 |
| CPU | Intel(R) Xeon(R) E-2378G CPU @ 2.80GHz |
| RAM | 64 GiB |
| Disk | Micron_5300_MTFD |
| OS | Ubuntu 22.04.1 LTS(kernel 5.15.0-53-generic) |
| Kubernetes | v1.23.6+rke2r2 |
| Longhorn | master-branch + backup improvement |
| Nodes | 3 nodes |
| Backupstore target | external MinIO S3 (m3.small.x86) |
| Volume | 50 GiB containing 1GB filesystem metadata and 10 GiB random data (3 replicas) |
#### Results
- Single-Threaded Backup and Restore by Different Compression Methods
![Single-Threaded Backup and Restore by Different Compression Methods](image/backup_perf/compression-methods.png)
- Multi-Threaded Backup
![Multi-Threaded Backup](image/backup_perf/multi-threaded-backup.png)
- Multi-Threaded Restore to One Volume with 3 Replicas
![Multi-Threaded Restore to One Volume with 3 Replicas](image/backup_perf/multi-threaded-restore-to-volume-3-replicas.png)
Restore hit the IO bound of the backupstore server, because the throughput is saturated from 5 worker threads.
- Multi-Threaded Restore to One Volume with 1 Replica
![Multi-Threaded Restore to One Volume with 1 Replica](image/backup_perf/multi-threaded-restore-to-volume-1-replica.png)
## Test Plan
### Integration Tests
1. Create a volume and then create backups using the compression methods `none`, `lz4` or `gzip` and different numbers of backup threads. The backups should succeed.
2. Restore the backups created in step 1 with different numbers of restore threads. Verify the data integrity of the disk files.


@ -0,0 +1,70 @@
# SMB/CIFS Backup Store Support
## Summary
Longhorn supports SMB/CIFS share as a backup storage.
### Related Issues
https://github.com/longhorn/longhorn/issues/3599
## Motivation
### Goals
- Support SMB/CIFS share as a backup storage.
## Proposal
- Introduce SMB/CIFS client for supporting SMB/CIFS as a backup storage.
## User Stories
Longhorn already supports NFSv4 and S3 servers as backup storage. However, certain users may encounter compatibility issues with their backup servers, particularly those running on Windows, as the protocols for NFSv4 and S3 are not always supported. To address this issue, this enhancement broadens the supported backup storage options with a focus on the commonly used SMB/CIFS protocol, which is compatible with both Linux and Windows-based servers.
### User Experience In Details
- Check that each Longhorn node's kernel supports the CIFS filesystem:
```
cat /boot/config-`uname -r` | grep CONFIG_CIFS
```
- Install the CIFS filesystem user-space tools `cifs-utils` on each Longhorn node
- Users can configure a SMB/CIFS share as a backup storage
- Set **Backup Target**. The path to an SMB/CIFS share has the form
```bash
cifs://${IP address}/${share name}
```
- Set **Backup Target Credential Secret**
- Create a secret and deploy it (see also the `kubectl` alternative after this list)
```yaml
apiVersion: v1
kind: Secret
metadata:
name: cifs-secret
namespace: longhorn-system
type: Opaque
data:
CIFS_USERNAME: ${CIFS_USERNAME}
CIFS_PASSWORD: ${CIFS_PASSWORD}
```
- Set the setting **Backup Target Credential Secret** to `cifs-secret`
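Note that the `data` fields of a Secret must be base64-encoded; alternatively, the same secret can be created with `kubectl`, which handles the encoding (the user name and password values are placeholders):
```bash
kubectl -n longhorn-system create secret generic cifs-secret \
    --from-literal=CIFS_USERNAME=<username> \
    --from-literal=CIFS_PASSWORD=<password>
```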
## Design
### Implementation Overview
- longhorn-manager
- Introduce the fields `CIFS_USERNAME` and `CIFS_PASSWORD` in credentials. The two fields are passed to engine and replica processes for volume backup and restore operations.
- backupstore
- Implement SMB/CIFS register/unregister and mount/unmount functions
### Test Plan
### Integration Tests
1. Set a SMB/CIFS share as backup storage.
2. Back up volumes to the backup storage and the operation should succeed.
3. Restore backups and the operation should succeed.


@ -0,0 +1,598 @@
# Consolidate Longhorn Instance Managers
## Summary
Longhorn's architecture includes engine and replica instance manager pods on each node. After an upgrade, Longhorn adds an additional engine and an additional replica instance manager pod. When the cluster is set with the default request of 12% guaranteed CPU, the instance manager pods together occupy 12% * 4 of the CPUs per node. This causes a high base resource requirement and is likely unnecessary.
```
NAME STATE E-CPU(CORES) E-MEM(BYTES) R-CPU(CORES) R-MEM(BYTES) CREATED-WORKLOADS DURATION(MINUTES) AGE
demo-0 (no-IO) Complete 8.88m 24Mi 1.55m 43Mi 5 10 22h
demo-0-bs-512b-5g Complete 109.70m 66Mi 36.46m 54Mi 5 10 16h
demo-0-bs-1m-10g Complete 113.16m 65Mi 36.63m 56Mi 5 10 14h
demo-0-bs-5m-10g Complete 114.17m 64Mi 31.37m 54Mi 5 10 42m
```
Aiming to simplify the architecture and free up some resource requests, this document proposes to consolidate the engine and replica instance managers into a single pod. This consolidation will not affect any data plane operations or volume migration. As the engine process is the primary consumer of CPU resources, merging the instance managers will result in a 50% reduction in CPU requests for instance managers. This is because there will only be one instance manager pod for both process types.
### Related Issues
Phase 1:
- https://github.com/longhorn/longhorn/issues/5208
Phase 2:
- https://github.com/longhorn/longhorn/issues/5842
- https://github.com/longhorn/longhorn/issues/5844
## Motivation
### Goals
- Having single instance manager pods to run replica and engine processes.
- After the Longhorn upgrade, the previous engine instance manager should continue to handle data plane operations for attached volumes until they are detached, and the replica instance managers should continue servicing data plane operations until the volume engine is upgraded or the volume is detached.
- Automatically clean up any engine/replica instance managers when all instances (processes) are removed.
- Online/offline volume engine upgrade should be functional. The replicas will automatically migrate to use the new `aio` (all-in-one) type instance managers, and the `engine` type instance manager will continue to serve until the first volume detachment.
- The Pod Disruption Budget (PDB) handling for cluster auto-scaler and node drain should work as expected.
### Non-goals [optional]
`None`
## Proposal
To ensure uninterrupted upgrades, this enhancement will be implemented in two phases. The existing `engine`/`replica` instance manager may coexist with the consolidated instance manager during the transition.
Phase 1:
- Introduce a new `aio` instance manager type. The `engine` and `replica` instance manager types will be deprecated and continue to serve the upgraded volumes until the first volume detachment.
- Introduce a new `Guaranteed Instance Manager CPU` setting; the `Guaranteed Engine Manager CPU` and `Guaranteed Replica Manager CPU` settings will be deprecated and continue to serve the upgraded volumes until the first volume detachment.
Phase 2:
- Remove all instance manager types.
- Remove the `Guaranteed Engine Manager CPU` and `Guaranteed Replica Manager CPU` settings.
### User Stories
- For freshly installed Longhorn, the user will see `aio` type instance managers.
- For upgraded Longhorn with all volumes detached, the user will see the `engine` and `replica` instance managers removed and replaced by `aio` type instance managers.
- For upgraded Longhorn with volume attached, the user will see existing `engine`, and `replica` instance managers still servicing the old attached volumes and the new `aio` type instance manager servicing new volume attachments.
### User Experience In Detail
#### New Installation
1. User creates and attaches a volume.
```
> kubectl -n longhorn-system get volume
NAME STATE ROBUSTNESS SCHEDULED SIZE NODE AGE
demo-0 attached unknown 21474836480 ip-10-0-1-113 12s
> kubectl -n longhorn-system get lhim
NAME STATE TYPE NODE AGE
instance-manager-8f81ca7c3bf95bbbf656be6ac2d1b7c4 running aio ip-10-0-1-105 124m
instance-manager-7e59c9f2ef7649630344050a8d5be68e running aio ip-10-0-1-102 124m
instance-manager-b34d5db1fe1e2d52bcfb308be3166cfc running aio ip-10-0-1-113 124m
> kubectl -n longhorn-system get lhim/instance-manager-b34d5db1fe1e2d52bcfb308be3166cfc -o yaml
apiVersion: longhorn.io/v1beta2
kind: InstanceManager
metadata:
creationTimestamp: "2023-03-16T10:48:59Z"
generation: 1
labels:
longhorn.io/component: instance-manager
longhorn.io/instance-manager-image: imi-8d41c3a4
longhorn.io/instance-manager-type: aio
longhorn.io/managed-by: longhorn-manager
longhorn.io/node: ip-10-0-1-113
name: instance-manager-b34d5db1fe1e2d52bcfb308be3166cfc
namespace: longhorn-system
ownerReferences:
- apiVersion: longhorn.io/v1beta2
blockOwnerDeletion: true
kind: Node
name: ip-10-0-1-113
uid: 00c0734b-f061-4b28-8071-62596274cb18
resourceVersion: "926067"
uid: a869def6-1077-4363-8b64-6863097c1e26
spec:
engineImage: ""
image: c3y1huang/research:175-lh-im
nodeID: ip-10-0-1-113
type: aio
status:
apiMinVersion: 1
apiVersion: 3
currentState: running
instanceEngines:
demo-0-e-06d4c77d:
spec:
name: demo-0-e-06d4c77d
status:
endpoint: ""
errorMsg: ""
listen: ""
portEnd: 10015
portStart: 10015
resourceVersion: 0
state: running
type: engine
instanceReplicas:
demo-0-r-ca78cab4:
spec:
name: demo-0-r-ca78cab4
status:
endpoint: ""
errorMsg: ""
listen: ""
portEnd: 10014
portStart: 10000
resourceVersion: 0
state: running
type: replica
ip: 10.42.0.238
ownerID: ip-10-0-1-113
proxyApiMinVersion: 1
proxyApiVersion: 4
```
- The engine and replica instances (processes) are created in the `aio` type instance manager.
#### Upgrade With Volumes Detached
1. User has a Longhorn v1.4.0 cluster and a volume in the detached state.
```
> kubectl -n longhorn-system get volume
NAME STATE ROBUSTNESS SCHEDULED SIZE NODE AGE
demo-1 detached unknown 21474836480 12s
> kubectl -n longhorn-system get lhim
NAME STATE TYPE NODE AGE
instance-manager-r-1278a39fa6e6d8f49eba156b81ac1f59 running replica ip-10-0-1-113 3m44s
instance-manager-e-1278a39fa6e6d8f49eba156b81ac1f59 running engine ip-10-0-1-113 3m44s
instance-manager-e-45ad195db7f55ed0a2dd1ea5f19c5edf running engine ip-10-0-1-105 3m41s
instance-manager-r-45ad195db7f55ed0a2dd1ea5f19c5edf running replica ip-10-0-1-105 3m41s
instance-manager-e-225a2c7411a666c8eab99484ab632359 running engine ip-10-0-1-102 3m42s
instance-manager-r-225a2c7411a666c8eab99484ab632359 running replica ip-10-0-1-102 3m42s
```
1. User upgraded Longhorn to v1.5.0.
```
> kubectl -n longhorn-system get lhim
NAME STATE TYPE NODE AGE
instance-manager-8f81ca7c3bf95bbbf656be6ac2d1b7c4 running aio ip-10-0-1-105 112s
instance-manager-7e59c9f2ef7649630344050a8d5be68e running aio ip-10-0-1-102 48s
instance-manager-b34d5db1fe1e2d52bcfb308be3166cfc running aio ip-10-0-1-113 47s
```
- Unused `engine` type instance managers removed.
- Unused `replica` type instance managers removed.
- 3 `aio` type instance managers created.
1. User upgraded volume engine.
1. User attaches the volume.
```
> kubectl -n longhorn-system get volume
NAME STATE ROBUSTNESS SCHEDULED SIZE NODE AGE
demo-1 attached healthy 21474836480 ip-10-0-1-113 4m51s
> kubectl -n longhorn-system get lhim
NAME STATE TYPE NODE AGE
instance-manager-8f81ca7c3bf95bbbf656be6ac2d1b7c4 running aio ip-10-0-1-105 3m58s
instance-manager-7e59c9f2ef7649630344050a8d5be68e running aio ip-10-0-1-102 2m54s
instance-manager-b34d5db1fe1e2d52bcfb308be3166cfc running aio ip-10-0-1-113 2m53s
> kubectl -n longhorn-system get lhim/instance-manager-b34d5db1fe1e2d52bcfb308be3166cfc -o yaml
apiVersion: longhorn.io/v1beta2
kind: InstanceManager
metadata:
creationTimestamp: "2023-03-16T13:03:15Z"
generation: 1
labels:
longhorn.io/component: instance-manager
longhorn.io/instance-manager-image: imi-8d41c3a4
longhorn.io/instance-manager-type: aio
longhorn.io/managed-by: longhorn-manager
longhorn.io/node: ip-10-0-1-113
name: instance-manager-b34d5db1fe1e2d52bcfb308be3166cfc
namespace: longhorn-system
ownerReferences:
- apiVersion: longhorn.io/v1beta2
blockOwnerDeletion: true
kind: Node
name: ip-10-0-1-113
uid: 12eb73cd-e9de-4c45-875d-3eff7cfb1034
resourceVersion: "3762"
uid: c996a89a-f841-4841-b69d-4218ed8d8c6e
spec:
engineImage: ""
image: c3y1huang/research:175-lh-im
nodeID: ip-10-0-1-113
type: aio
status:
apiMinVersion: 1
apiVersion: 3
currentState: running
instanceEngines:
demo-1-e-b7d28fb3:
spec:
name: demo-1-e-b7d28fb3
status:
endpoint: ""
errorMsg: ""
listen: ""
portEnd: 10015
portStart: 10015
resourceVersion: 0
state: running
type: engine
instanceReplicas:
demo-1-r-189c1bbb:
spec:
name: demo-1-r-189c1bbb
status:
endpoint: ""
errorMsg: ""
listen: ""
portEnd: 10014
portStart: 10000
resourceVersion: 0
state: running
type: replica
ip: 10.42.0.28
ownerID: ip-10-0-1-113
proxyApiMinVersion: 1
proxyApiVersion: 4
```
- The engine and replica instances (processes) are created in the `aio` type instance manager.
#### Upgrade With Volumes Attached
1. User has a Longhorn v1.4.0 cluster and a volume in the attached state.
```
> kubectl -n longhorn-system get volume
NAME STATE ROBUSTNESS SCHEDULED SIZE NODE AGE
demo-2 attached healthy 21474836480 ip-10-0-1-113 35s
> kubectl -n longhorn-system get lhim
NAME STATE TYPE NODE AGE
instance-manager-r-1278a39fa6e6d8f49eba156b81ac1f59 running replica ip-10-0-1-113 2m41s
instance-manager-r-45ad195db7f55ed0a2dd1ea5f19c5edf running replica ip-10-0-1-105 119s
instance-manager-r-225a2c7411a666c8eab99484ab632359 running replica ip-10-0-1-102 119s
instance-manager-e-1278a39fa6e6d8f49eba156b81ac1f59 running engine ip-10-0-1-113 2m41s
instance-manager-e-225a2c7411a666c8eab99484ab632359 running engine ip-10-0-1-102 119s
instance-manager-e-45ad195db7f55ed0a2dd1ea5f19c5edf running engine ip-10-0-1-105 119s
```
1. User upgraded Longhorn to v1.5.0.
```
> kubectl -n longhorn-system get lhim
NAME STATE TYPE NODE AGE
instance-manager-r-1278a39fa6e6d8f49eba156b81ac1f59 running replica ip-10-0-1-113 5m24s
instance-manager-r-45ad195db7f55ed0a2dd1ea5f19c5edf running replica ip-10-0-1-105 4m42s
instance-manager-r-225a2c7411a666c8eab99484ab632359 running replica ip-10-0-1-102 4m42s
instance-manager-e-1278a39fa6e6d8f49eba156b81ac1f59 running engine ip-10-0-1-113 5m24s
instance-manager-b34d5db1fe1e2d52bcfb308be3166cfc running aio ip-10-0-1-113 117s
instance-manager-7e59c9f2ef7649630344050a8d5be68e running aio ip-10-0-1-102 33s
instance-manager-8f81ca7c3bf95bbbf656be6ac2d1b7c4 running aio ip-10-0-1-105 32s
```
- 2 unused `engine` type instance managers removed.
- 3 `aio` type instance managers created.
1. User upgraded online volume engine.
```
> kubectl -n longhorn-system get lhim
NAME STATE TYPE NODE AGE
instance-manager-8f81ca7c3bf95bbbf656be6ac2d1b7c4 running aio ip-10-0-1-105 6m53s
instance-manager-b34d5db1fe1e2d52bcfb308be3166cfc running aio ip-10-0-1-113 8m18s
instance-manager-7e59c9f2ef7649630344050a8d5be68e running aio ip-10-0-1-102 6m54s
instance-manager-e-1278a39fa6e6d8f49eba156b81ac1f59 running engine ip-10-0-1-113 11m
```
- All `replica` type instance managers migrated to `aio` type instance managers.
1. User detached the volume.
```
> kubectl -n longhorn-system get lhim
NAME STATE TYPE NODE AGE
instance-manager-8f81ca7c3bf95bbbf656be6ac2d1b7c4 running aio ip-10-0-1-105 8m38s
instance-manager-b34d5db1fe1e2d52bcfb308be3166cfc running aio ip-10-0-1-113 10m
instance-manager-7e59c9f2ef7649630344050a8d5be68e running aio ip-10-0-1-102 8m39s
```
- The `engine` type instance managers removed.
1. User attached the volume.
```
> kubectl -n longhorn-system get volume
NAME STATE ROBUSTNESS SCHEDULED SIZE NODE AGE
demo-2 attached healthy 21474836480 ip-10-0-1-113 12m
> kubectl -n longhorn-system get lhim
NAME STATE TYPE NODE AGE
instance-manager-7e59c9f2ef7649630344050a8d5be68e running aio ip-10-0-1-102 9m40s
instance-manager-8f81ca7c3bf95bbbf656be6ac2d1b7c4 running aio ip-10-0-1-105 9m39s
instance-manager-b34d5db1fe1e2d52bcfb308be3166cfc running aio ip-10-0-1-113 11m
> kubectl -n longhorn-system get lhim/instance-manager-b34d5db1fe1e2d52bcfb308be3166cfc -o yaml
apiVersion: longhorn.io/v1beta2
kind: InstanceManager
metadata:
creationTimestamp: "2023-03-16T13:12:41Z"
generation: 1
labels:
longhorn.io/component: instance-manager
longhorn.io/instance-manager-image: imi-8d41c3a4
longhorn.io/instance-manager-type: aio
longhorn.io/managed-by: longhorn-manager
longhorn.io/node: ip-10-0-1-113
name: instance-manager-b34d5db1fe1e2d52bcfb308be3166cfc
namespace: longhorn-system
ownerReferences:
- apiVersion: longhorn.io/v1beta2
blockOwnerDeletion: true
kind: Node
name: ip-10-0-1-113
uid: 6d109c40-abe3-42ed-8e40-f76cfc33e4c2
resourceVersion: "4339"
uid: 01556f2c-fbb4-4a15-a778-c73df518b070
spec:
engineImage: ""
image: c3y1huang/research:175-lh-im
nodeID: ip-10-0-1-113
type: aio
status:
apiMinVersion: 1
apiVersion: 3
currentState: running
instanceEngines:
demo-2-e-65845267:
spec:
name: demo-2-e-65845267
status:
endpoint: ""
errorMsg: ""
listen: ""
portEnd: 10015
portStart: 10015
resourceVersion: 0
state: running
type: engine
instanceReplicas:
demo-2-r-a2bd415f:
spec:
name: demo-2-r-a2bd415f
status:
endpoint: ""
errorMsg: ""
listen: ""
portEnd: 10014
portStart: 10000
resourceVersion: 0
state: running
type: replica
ip: 10.42.0.31
ownerID: ip-10-0-1-113
proxyApiMinVersion: 1
proxyApiVersion: 4
```
- The engine and replica instances (processes) are created in the `aio` type instance manager.
### API changes
- Introduce new `instanceManagerCPURequest` in `Node` resource.
- Introduce new `instanceEngines` in InstanceManager resource.
- Introduce new `instanceReplicas` in InstanceManager resource.
## Design
### Phase 1: All-in-one Instance Manager Implementation Overview
Introduce a new instance manager type while letting Longhorn continue to service existing attached volumes in v1.5.x.
#### New Instance Manager Type
- Introduce a new `aio` (all-in-one) instance manager type to differentiate the handling of the old `engine`/`replica` instance managers and the new consolidated instance managers.
- When getting InstanceManagers by instance of the attached volume, retrieve the InstanceManager from the instance manager list using the new `aio` type.
#### InstanceManager `instances` Field Replacement For New InstanceManagers
- New InstanceManagers will use the `instanceEngines` and `instanceReplicas` fields, replacing the `instances` field.
- For the existing InstanceManagers for the attached Volumes, the `instances` field will remain in use.
#### Instance Manager Execution
- Rename the `engine-manager` script to `instance-manager`.
- Bump up version to `4`.
#### New Instance Manager Pod
- Replace the `engine` and `replica` pod creation with the spec used for the `aio` instance manager pod.
```
> kubectl -n longhorn-system get pod/instance-manager-0d96990c6881c828251c534eb31bfa85 -o yaml
apiVersion: v1
kind: Pod
metadata:
annotations:
longhorn.io/last-applied-tolerations: '[]'
creationTimestamp: "2023-03-01T08:13:03Z"
labels:
longhorn.io/component: instance-manager
longhorn.io/instance-manager-image: imi-a1873aa3
longhorn.io/instance-manager-type: aio
longhorn.io/managed-by: longhorn-manager
longhorn.io/node: ip-10-0-1-113
name: instance-manager-0d96990c6881c828251c534eb31bfa85
namespace: longhorn-system
ownerReferences:
- apiVersion: longhorn.io/v1beta2
blockOwnerDeletion: true
controller: true
kind: InstanceManager
name: instance-manager-0d96990c6881c828251c534eb31bfa85
uid: 51c13e4f-d0a2-445d-b98b-80cca7080c78
resourceVersion: "12133"
uid: 81397cca-d9e9-48f6-8813-e7f2e2cd4617
spec:
containers:
- args:
- instance-manager
- --debug
- daemon
- --listen
- 0.0.0.0:8500
env:
- name: TLS_DIR
value: /tls-files/
image: c3y1huang/research:174-lh-im
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
initialDelaySeconds: 3
periodSeconds: 5
successThreshold: 1
tcpSocket:
port: 8500
timeoutSeconds: 4
name: instance-manager
resources:
requests:
cpu: 960m
securityContext:
privileged: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /host
mountPropagation: HostToContainer
name: host
- mountPath: /engine-binaries/
mountPropagation: HostToContainer
name: engine-binaries
- mountPath: /host/var/lib/longhorn/unix-domain-socket/
name: unix-domain-socket
- mountPath: /tls-files/
name: longhorn-grpc-tls
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-hkbfc
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
nodeName: ip-10-0-1-113
preemptionPolicy: PreemptLowerPriority
priority: 0
restartPolicy: Never
schedulerName: default-scheduler
securityContext: {}
serviceAccount: longhorn-service-account
serviceAccountName: longhorn-service-account
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- hostPath:
path: /
type: ""
name: host
- hostPath:
path: /var/lib/longhorn/engine-binaries/
type: ""
name: engine-binaries
- hostPath:
path: /var/lib/longhorn/unix-domain-socket/
type: ""
name: unix-domain-socket
- name: longhorn-grpc-tls
secret:
defaultMode: 420
optional: true
secretName: longhorn-grpc-tls
- name: kube-api-access-hkbfc
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2023-03-01T08:13:03Z"
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: "2023-03-01T08:13:04Z"
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: "2023-03-01T08:13:04Z"
status: "True"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: "2023-03-01T08:13:03Z"
status: "True"
type: PodScheduled
containerStatuses:
- containerID: containerd://cb249b97d128e47a7f13326b76496656d407fd16fc44b5f1a37384689d0fa900
image: docker.io/c3y1huang/research:174-lh-im
imageID: docker.io/c3y1huang/research@sha256:1f4e86b92b3f437596f9792cd42a1bb59d1eace4196139dc030b549340af2e68
lastState: {}
name: instance-manager
ready: true
restartCount: 0
started: true
state:
running:
startedAt: "2023-03-01T08:13:03Z"
hostIP: 10.0.1.113
phase: Running
podIP: 10.42.0.27
podIPs:
- ip: 10.42.0.27
qosClass: Burstable
startTime: "2023-03-01T08:13:03Z"
```
#### Controllers Change
- Map the status of the engine/replica process to the corresponding instanceEngines/instanceReplicas fields in the InstanceManager instead of the instances field. To ensure backward compatibility, the instances field will continue to be utilized by the pre-upgrade attached volume.
- Ensure support for the previous version's attached volumes with the old engine/replica instance manager types.
- Replace the old engine/replica InstanceManagers with the aio type instance manager during replenishment.
#### New Setting
- Introduce a new `Guaranteed Instance Manager CPU` setting for the new `aio` instance manager pod (see the example after this list).
- The `Guaranteed Engine Manager CPU` and `Guaranteed Replica Manager CPU` will co-exist with this setting in Longhorn v1.5.x.
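For illustration, the new setting would be managed like any other Longhorn setting; a hedged sketch assuming the setting name `guaranteed-instance-manager-cpu`:
```bash
kubectl -n longhorn-system get settings.longhorn.io guaranteed-instance-manager-cpu
kubectl -n longhorn-system edit settings.longhorn.io guaranteed-instance-manager-cpu
```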
### Phase 2 - Deprecations Overview
This phase is based on the assumption that, when upgrading from v1.5.x to v1.6.x, volumes have been detached at least once and have migrated to `aio` type instance managers. The cluster should then have no volume depending on `engine` and `replica` type instance managers. Therefore, in this phase, remove the related types and settings.
#### Old Instance Manager Types
- Remove the `engine`, `replica`, and `aio` instance manager types. There is no need for differentiation.
#### Old Settings
- Remove the `Guaranteed Engine Manager CPU` and `Guaranteed Replica Manager CPU` settings. The settings have already been replaced by the `Guaranteed Instance Manager CPU` setting in phase 1.
#### Controllers Change
- Remove support for engine/replica InstanceManager types.
### Test plan
Support new `aio` instance manager type and run regression test cases.
### Upgrade strategy
The `instances` field in the instance manager custom resource will still be utilized by old instance managers of the attached volume.
## Note [optional]
`None`


@ -0,0 +1,77 @@
# Use PDB to protect Longhorn components from drains
## Summary
Some Longhorn components must remain available during the draining process to correctly handle the cleanup/detachment of Longhorn volumes.
They are: `csi-attacher`, `csi-provisioner`, `longhorn-admission-webhook`, `longhorn-conversion-webhook`, `share-manager`, `instance-manager`, and daemonset pods in `longhorn-system` namespace.
This LEP outlines our existing solutions to protect these components, the issues of these solutions, and the proposal for improvement.
### Related Issues
https://github.com/longhorn/longhorn/issues/3304
## Motivation
### Goals
1. Have better ways to protect Longhorn components (`csi-attacher`, `csi-provisioner`, `longhorn-admission-webhook`, `longhorn-conversion-webhook`) without demanding the users to specify the draining flags to skip these pods.
## Proposal
1. Our existing solutions to protect these components are:
* For `instance-manager`: dynamically create/delete instance manager PDB
* For Daemonset pods in `longhorn-system` namespace: we advise the users to specify `--ignore-daemonsets` to ignore them in the `kubectl drain` command. This actually follows the [best practice](https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/#:~:text=If%20there%20are%20pods%20managed%20by%20a%20DaemonSet%2C%20you%20will%20need%20to%20specify%20%2D%2Dignore%2Ddaemonsets%20with%20kubectl%20to%20successfully%20drain%20the%20node)
* For `csi-attacher`, `csi-provisioner`, `longhorn-admission-webhook`, and `longhorn-conversion-webhook`: we advise the user to specify `--pod-selector` to ignore these pods
1. Proposal for `csi-attacher`, `csi-provisioner`, `longhorn-admission-webhook`, and `longhorn-conversion-webhook`: <br>
The problem with the existing solution is that sometimes users cannot specify `--pod-selector` for the `kubectl drain` command.
For example, users of the [System Upgrade Controller](https://github.com/rancher/system-upgrade-controller) project don't have the option to specify `--pod-selector`.
Also, we would like a more automatic approach instead of relying on the user to set `kubectl drain` options.
Therefore, we propose the following design:
* Longhorn manager automatically create PDBs for `csi-attacher`, `csi-provisioner`, `longhorn-admission-webhook`, and `longhorn-conversion-webhook` with `minAvailable` set to 1.
This will make sure that each of these deployments has at least 1 running pod during the draining process.
* Longhorn manager continuously watches the volumes and removes the PDBs once there is no attached volume.
This should work for both single-node and multi-node cluster.
### User Stories
#### Story 1
Before the enhancement, users would need to specify the drain options for drain command to exclude Longhorn pods.
Sometimes, this is not possible when users use third-party solution to drain and upgrade kubernetes, such as System Upgrade Controller.
#### Story 2
### User Experience In Detail
After the enhancement, the user no longer needs to specify drain options to exclude Longhorn pods from the drain command.
### API changes
None
## Design
### Implementation Overview
Create a new controller inside Longhorn manager called `longhorn-pdb-controller`; it listens for changes to `csi-attacher`, `csi-provisioner`, `longhorn-admission-webhook`, `longhorn-conversion-webhook`, and Longhorn volumes, and adjusts the PDBs accordingly.
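As an illustration, a minimal sketch of the kind of PDB the controller could create for one of these deployments; the name and label selector are assumptions, not the exact objects longhorn-manager creates:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: longhorn-csi-attacher-pdb
  namespace: longhorn-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: csi-attacher
```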
### Test plan
https://github.com/longhorn/longhorn/issues/3304#issuecomment-1467174481
### Upgrade strategy
No Upgrade is needed
## Note
In the original GitHub ticket, we mentioned that we need to add a PDB to protect the share manager pod from being drained before its workload pods, because if the share manager pod doesn't exist its volume cannot be unmounted in the CSI flow.
However, with the fix https://github.com/longhorn/longhorn/issues/5296, we can always unmount the volume even if the share manager is not running.
Therefore, we don't need to protect the share manager pod.


@ -0,0 +1,86 @@
# Recurring Filesystem Trim
## Summary
Longhorn currently supports the [filesystem trim](./20221103-filesystem-trim.md) feature, which allows users to reclaim volume disk space used by deleted files. However, this is a manual process, which can be time-consuming and inconvenient.
To improve the user experience, Longhorn could automate the process by implementing a new RecurringJob `filesystem-trim` type. This enhancement enables regularly freeing up unused volume space and reduces the need for manual intervention.
### Related Issues
https://github.com/longhorn/longhorn/issues/5186
## Motivation
### Goals
Introduce a new recurring job type called `filesystem-trim` to periodically trim the volume filesystem to reclaim disk spaces.
### Non-goals [optional]
`None`
## Proposal
To extend the RecurringJob custom resource definition by adding new `RecurringJobType: filesystem-trim`.
### User Stories
To schedule regular volume filesystem trims, the user can create a RecurringJob with `spec.task=filesystem-trim` and associate it with volumes.
### User Experience In Detail
#### Recurring Filesystem Trim
1. The user sees workload volume size has increased over time.
1. Create RecurringJob with the `filesystem-trim` task type and assign it to the volume.
```yaml
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
name: recurring-fs-trim-per-min
namespace: longhorn-system
spec:
concurrency: 1
cron: '* * * * *'
groups: []
labels: {}
name: recurring-fs-trim-per-min
retain: 0
task: filesystem-trim
```
1. The RecurringJob runs and reclaims some volume space.
### API changes
`None`
## Design
### Implementation Overview
#### The RecurringJob `filesystem-trim` Task Type
1. Call Volume API `ActionTrimFilesystem` when the RecurringJob type is `filesystem-trim`.
### RecurringJob Mutate
1. Mutate the `Recurringjob.Spec.Retain` to 0 when the task type is `filesystem-trim` as it is not effective for this type of task.
### Test plan
#### Test Recurring Filesystem Trim
1. Create workload.
1. Create a file with some data in the workload.
1. Volume actual size should increase.
1. Delete the file.
1. Volume actual size should not decrease.
1. Create RecurringJob with type `filesystem-trim` and assign to the workload volume.
1. Wait for RecurringJob to complete.
1. Volume actual size should decrease.
### Upgrade strategy
`None`
## Note [optional]
`None`


@ -0,0 +1,269 @@
# Upgrade Path Enforcement
## Summary
Currently, Longhorn does not enforce the upgrade path, even though we claim Longhorn only supports upgrading from the previous stable release, for example, upgrading to 1.5.x is only supported from 1.4.x or 1.5.0.
Without upgrade enforcement, we will allow users to upgrade from any previous version. This will cause extra testing efforts to cover all upgrade paths. Additionally, the goal of this enhancement is to support rollback after upgrade failure and prevent downgrades.
### Related Issues
https://github.com/longhorn/longhorn/issues/5131
## Motivation
### Goals
- Enforce an upgrade path to prevent users from upgrading from any unsupported version. After rejecting the user's upgrade attempt, the user's Longhorn setup should remain intact without any impacts.
- Upgrade Longhorn from the authorized versions to a major release version.
- Support rollback the failed upgrade to the previous version.
- Prevent unexpected downgrade.
### Non-goals
- Automatic rollback if the upgrade failed.
## Proposal
- When upgrading with `kubectl`, the upgrade path will be checked at the entry point of the pods for `longhorn-manager`, `longhorn-admission-webhook`, `longhorn-conversion-webhook` and `longhorn-recovery-backend`.
- When upgrading with `Helm` or via the `Rancher App Marketplace`, the upgrade path will be checked by a `pre-upgrade` Helm hook job. (A sketch of the version check follows this list.)
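To illustrate the rule being enforced (upgrade allowed from x.y.* or x.(y+1).0 to x.(y+1).*, or from an authorized version to the next major release; anything else, including downgrades, is rejected), here is a hedged Go sketch; the function name and exact policy encoding are assumptions, not the longhorn-manager implementation:
```go
package main

import "fmt"

// canUpgrade returns true when moving from `from` to `to` follows the supported
// path: same minor, the next minor of the same major, or the next major release.
func canUpgrade(from, to string) (bool, error) {
	var fMaj, fMin, fPatch, tMaj, tMin, tPatch int
	if _, err := fmt.Sscanf(from, "v%d.%d.%d", &fMaj, &fMin, &fPatch); err != nil {
		return false, err
	}
	if _, err := fmt.Sscanf(to, "v%d.%d.%d", &tMaj, &tMin, &tPatch); err != nil {
		return false, err
	}
	switch {
	case tMaj < fMaj || (tMaj == fMaj && tMin < fMin): // downgrade: always rejected
		return false, nil
	case tMaj == fMaj && (tMin == fMin || tMin == fMin+1): // patch or next minor
		return true, nil
	case tMaj == fMaj+1: // next major release from an authorized version
		return true, nil
	default: // skipping one or more minor/major versions
		return false, nil
	}
}

func main() {
	for _, c := range [][2]string{{"v1.4.2", "v1.5.1"}, {"v1.3.3", "v1.5.1"}, {"v1.5.1", "v1.4.2"}} {
		ok, _ := canUpgrade(c[0], c[1])
		fmt.Printf("%s -> %s allowed: %v\n", c[0], c[1], ok)
	}
}
```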
### User Stories
- As the admin, I want to upgrade Longhorn from x.y.* or x.(y+1).0 to x.(y+1).* by `kubectl`, `Helm` or `Rancher App Marketplace`, so that the upgrade should succeed.
- As the admin, I want to upgrade Longhorn from the previous authorized versions to a new major/minor version by `kubectl`, `Helm`, or `Rancher App Marketplace`, so that the upgrade should succeed.
- As the admin, I want to upgrade Longhorn from x.(y-1).* to x.(y+1).* by `kubectl`, `Helm` or `Rancher App Marketplace`, so that the upgrade should be prevented and the system with the current version continues running w/o any interruptions.
- As the admin, I want to roll back Longhorn from the failed upgrade to the previous install by `kubectl`, `Helm`, or `Rancher App Marketplace`, so that the rollback should succeed.
- As the admin, I want to downgrade Longhorn to any lower version by `kubectl`, `Helm`, or `Rancher App Marketplace`, so that the downgrade should be prevented and the system with the current version continues running w/o any interruptions.
### User Experience In Detail
#### Upgrade Longhorn From x.y.* or x.(y+1).0 To x.(y+1).*
##### Upgrade With `kubectl`
1. Install Longhorn on any Kubernetes cluster by using this command:
```shell
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/vx.y.*/deploy/longhorn.yaml
```
or
```shell
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/vx.(y+1).0/deploy/longhorn.yaml
```
1. After Longhorn works normally, upgrade Longhorn by using this command:
```shell
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/vx.(y+1).*/deploy/longhorn.yaml
```
1. It will be allowed and Longhorn will be upgraded successfully.
##### Upgrade With `Helm` Or `Rancher App Marketplace`
1. Install Longhorn x.y.* or x.(y+1).0 with Helm as [Longhorn Install with Helm document](https://longhorn.io/docs/1.4.1/deploy/install/install-with-helm/) or install Longhorn x.y.* or x.(y+1).0 with a Rancher Apps as [Longhorn Install as a Rancher Apps & Marketplace document](https://longhorn.io/docs/1.4.1/deploy/install/install-with-rancher/)
1. Upgrade to Longhorn x.(y+1).* with Helm as [Longhorn Upgrade with Helm document](https://longhorn.io/docs/1.4.1/deploy/upgrade/longhorn-manager/#upgrade-with-helm) or upgrade to Longhorn x.(y+1).* with a Rancher Catalog App as [Longhorn Upgrade as a Rancher Apps & Marketplace document](https://longhorn.io/docs/1.4.1/deploy/upgrade/longhorn-manager/#upgrade-as-a-rancher-catalog-app)
1. It will be allowed and Longhorn will be upgraded successfully.
#### Upgrade Longhorn From The Authorized Versions To A Major Release Version
##### Upgrade With `kubectl`
1. Install Longhorn on any Kubernetes cluster by using this command:
```shell
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/vx.y.*/deploy/longhorn.yaml
```
1. After Longhorn works normally, upgrade Longhorn by using this command:
```shell
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v(x+1).0.*/deploy/longhorn.yaml
```
1. It will be allowed and Longhorn will be upgraded successfully.
##### Upgrade With `Helm` Or `Rancher App Marketplace`
1. Install Longhorn x.y.* with Helm such as [Longhorn Install with Helm document](https://longhorn.io/docs/1.4.1/deploy/install/install-with-helm/) or install Longhorn x.y.* with a Rancher Apps as [Longhorn Install as a Rancher Apps & Marketplace document](https://longhorn.io/docs/1.4.1/deploy/install/install-with-rancher/)
1. Upgrade to Longhorn (x+1).0.* with Helm as [Longhorn Upgrade with Helm document](https://longhorn.io/docs/1.4.1/deploy/upgrade/longhorn-manager/#upgrade-with-helm) or upgrade to Longhorn (x+1).0.* with a Rancher Catalog App as [Longhorn Upgrade as a Rancher Apps & Marketplace document](https://longhorn.io/docs/1.4.1/deploy/upgrade/longhorn-manager/#upgrade-as-a-rancher-catalog-app)
1. It will be allowed and Longhorn will be upgraded successfully.
#### Upgrade Longhorn From x.(y-1).* To x.(y+1).*
##### Upgrade With `kubectl`
1. Install Longhorn on any Kubernetes cluster by using this command:
```shell
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/vx.(y-1).*/deploy/longhorn.yaml
```
1. After Longhorn works normally, upgrade Longhorn by using this command:
```shell
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/vx.(y+1).*/deploy/longhorn.yaml
```
1. It will not be allowed and Longhorn will block the upgrade for `longhorn-manager`, `longhorn-admission-webhook`, `longhorn-conversion-webhook` and `longhorn-recovery-backend`.
1. Users need to roll back Longhorn manually to restart `longhorn-manager` pods.
##### Upgrade With `Helm` Or `Rancher App Marketplace`
1. Install Longhorn x.(y-1).* with Helm as described in the [Longhorn Install with Helm document](https://longhorn.io/docs/1.4.1/deploy/install/install-with-helm/) or with Rancher Apps as described in the [Longhorn Install as a Rancher Apps & Marketplace document](https://longhorn.io/docs/1.4.1/deploy/install/install-with-rancher/)
1. Upgrade to Longhorn x.(y+1).* with Helm as described in the [Longhorn Upgrade with Helm document](https://longhorn.io/docs/1.4.1/deploy/upgrade/longhorn-manager/#upgrade-with-helm) or with a Rancher Catalog App as described in the [Longhorn Upgrade as a Rancher Apps & Marketplace document](https://longhorn.io/docs/1.4.1/deploy/upgrade/longhorn-manager/#upgrade-as-a-rancher-catalog-app)
1. It will not be allowed: the failed `pre-upgrade` `Helm` hook job makes the whole Helm upgrade fail.
1. Longhorn is intact and continues serving.
#### Roll Back Longhorn From The Failed Upgrade To The Previous Install
##### Roll Back With `kubectl`
1. Users need to recover Longhorn by using this command again:
```shell
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/[previous installed version]/deploy/longhorn.yaml
```
1. Longhorn will be rolled back successfully.
1. Users might need to manually delete new components introduced by the newer Longhorn version.
##### Roll Back With `Helm` Or `Rancher App Marketplace`
1. Users need to recover Longhorn with `Helm` by using commands:
```shell
helm history longhorn # to get previous installed Longhorn REVISION
helm rollback longhorn [REVISION]
```
or
```shell
helm upgrade longhorn longhorn/longhorn --namespace longhorn-system --version [previous installed version]
```
1. Users can recover Longhorn with `Rancher Catalog Apps` by upgrading to the previously installed Longhorn version at the `Rancher App Marketplace` again.
1. Longhorn will be rolled back successfully.
##### Manual Cleanup Example
When users try to upgrade Longhorn from v1.3.x to v1.5.x, a new deployment `longhorn-recovery-backend` will be introduced and the upgrade will fail.
Users need to delete the deployment `longhorn-recovery-backend` manually after rolling back Longhorn (for example, with `kubectl -n longhorn-system delete deployment longhorn-recovery-backend`).
#### Downgrade Longhorn To Any Lower Version
##### Downgrade With `kubectl`
1. Install Longhorn on any Kubernetes cluster by using this command:
```shell
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/vx.y.*/deploy/longhorn.yaml
```
1. After Longhorn works normally, downgrade Longhorn by using this command:
```shell
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/vx.(y-z).*/deploy/longhorn.yaml
```
1. It will not be allowed: Longhorn will block the downgrade for `longhorn-manager` (and for `longhorn-admission-webhook`, `longhorn-conversion-webhook` and `longhorn-recovery-backend` if the downgrade target version has these components).
1. Users need to roll back Longhorn manually to restart `longhorn-manager` pods.
##### Downgrade With `Helm` Or `Rancher App Marketplace`
1. Install Longhorn x.y.* with Helm as described in the [Longhorn Install with Helm document](https://longhorn.io/docs/1.4.1/deploy/install/install-with-helm/) or with Rancher Apps as described in the [Longhorn Install as a Rancher Apps & Marketplace document](https://longhorn.io/docs/1.4.1/deploy/install/install-with-rancher/)
1. Downgrade to Longhorn (x-z).y.* or x.(y-z).* with Helm as described in the [Longhorn Upgrade with Helm document](https://longhorn.io/docs/1.4.1/deploy/upgrade/longhorn-manager/#upgrade-with-helm) or with a Rancher Catalog App as described in the [Longhorn Upgrade as a Rancher Apps & Marketplace document](https://longhorn.io/docs/1.4.1/deploy/upgrade/longhorn-manager/#upgrade-as-a-rancher-catalog-app)
1. It will not be allowed: the failed `pre-upgrade` `Helm` hook job makes the whole Helm downgrade fail.
1. Longhorn is intact and continues serving.
### API changes
`None`
## Design
### Implementation Overview
#### Blocking Upgrade With `kubectl`
Check whether the upgrade path is supported at the entry points of `longhorn-manager`, `longhorn-admission-webhook`, `longhorn-conversion-webhook` and `longhorn-recovery-backend`:
1. Get the current Longhorn version `currentVersion` via the function `GetCurrentLonghornVersion`.
1. Get the upgrading Longhorn version `upgradeVersion` from `meta.Version`.
1. Compare `currentVersion` and `upgradeVersion` and only allow supported upgrade paths (e.g., 1.3.x to 1.5.x is not allowed), as shown in the following table. A sketch of this check follows the list below.
| currentVersion | upgradeVersion | Allow |
| :-: | :-: | :-: |
| x.y.* | x.(y+1).* | ✓ |
| x.y.0 | x.y.* | ✓ |
| x.y.* | (x+1).y.* | ✓ |
| x.(y-1).* | x.(y+1).* | X |
| x.(y-2).* | x.(y+1).* | X |
| x.y.* | x.(y-1).* | X |
| x.y.* | x.y.(*-1) | X |
1. Downgrades are not allowed.
1. When the upgrade path is not supported, newly created pods of `longhorn-manager`, `longhorn-admission-webhook`, `longhorn-conversion-webhook` and `longhorn-recovery-backend` will log and broadcast events stating that the upgrade path is not supported, and return errors.
1. The previously installed Longhorn will keep working normally.
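A minimal sketch of that comparison step, assuming semver-style `x.y.z` version strings. `CheckUpgradePath` and `parseMajorMinor` are illustrative names (only `GetCurrentLonghornVersion` and `meta.Version` above come from the actual code), the patch-level row of the table is omitted for brevity, and the standard `fmt`, `strconv` and `strings` packages are assumed:
```golang
// CheckUpgradePath sketches the entry-point check described above: an upgrade may
// move ahead by at most one minor version (or to the next major version), and any
// downgrade is rejected.
func CheckUpgradePath(currentVersion, upgradeVersion string) error {
    curMajor, curMinor, err := parseMajorMinor(currentVersion)
    if err != nil {
        return err
    }
    upMajor, upMinor, err := parseMajorMinor(upgradeVersion)
    if err != nil {
        return err
    }
    // Reject downgrades (lower major, or lower minor within the same major).
    if upMajor < curMajor || (upMajor == curMajor && upMinor < curMinor) {
        return fmt.Errorf("downgrading from %v to %v is not supported", currentVersion, upgradeVersion)
    }
    // Reject upgrades that skip a minor or a major version, e.g. 1.3.x to 1.5.x.
    if upMajor-curMajor > 1 || (upMajor == curMajor && upMinor-curMinor > 1) {
        return fmt.Errorf("upgrading from %v to %v is not supported", currentVersion, upgradeVersion)
    }
    return nil
}

func parseMajorMinor(version string) (major, minor int, err error) {
    fields := strings.Split(strings.TrimPrefix(version, "v"), ".")
    if len(fields) < 2 {
        return 0, 0, fmt.Errorf("invalid version %v", version)
    }
    if major, err = strconv.Atoi(fields[0]); err != nil {
        return 0, 0, err
    }
    if minor, err = strconv.Atoi(fields[1]); err != nil {
        return 0, 0, err
    }
    return major, minor, nil
}
```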
#### Blocking Upgrade With `Helm` Or `Rancher App Marketplace`
1. Add a new `pre-upgrade` hook job for `Helm`, modeled on the existing [`post-upgrade` job](https://github.com/longhorn/longhorn/blob/master/chart/templates/postupgrade-job.yaml).
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  annotations:
    "helm.sh/hook": pre-upgrade
    "helm.sh/hook-delete-policy": hook-succeeded,before-hook-creation,hook-failed
  name: longhorn-pre-upgrade
  ...
spec:
  ...
  template:
    metadata:
      name: longhorn-pre-upgrade
      ...
    spec:
      containers:
        - name: longhorn-pre-upgrade
          ...
          command:
            - longhorn-manager
            - pre-upgrade
          env:
            ...
```
1. When the upgrade starts, the `pre-upgrade` job runs first; if the upgrade path is not supported, the job fails and the whole `Helm` upgrade process fails (see the sketch below).
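A minimal sketch of the `pre-upgrade` command that job could run, assuming the `urfave/cli` package already used elsewhere in this document and reusing the `CheckUpgradePath` sketch above; the signature of `GetCurrentLonghornVersion` is simplified here and all names are illustrative:
```golang
// PreUpgradeCmd sketches the command run by the Helm pre-upgrade hook job. If the
// upgrade path is unsupported, the command returns an error, the job exits non-zero,
// and the whole `helm upgrade` is aborted before any Longhorn component is touched.
func PreUpgradeCmd() cli.Command {
    return cli.Command{
        Name: "pre-upgrade",
        Action: func(c *cli.Context) error {
            // Version currently installed in the cluster (lookup simplified for this sketch).
            currentVersion, err := GetCurrentLonghornVersion()
            if err != nil {
                return err
            }
            // meta.Version is the version shipped with this binary, i.e. the upgrade target.
            return CheckUpgradePath(currentVersion, meta.Version)
        },
    }
}
```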
### Test plan
#### Test Supported Upgrade Path
1. Install Longhorn v1.4.x.
1. Wait for all pods ready.
1. Create a Volume and write some data.
1. Upgrade to Longhorn v1.5.0.
1. Wait for all pods upgraded successfully.
1. Check if data is not corrupted.
#### Test Unsupported Upgrade Path
1. Install Longhorn v1.3.x.
1. Wait for all pods ready.
1. Create a Volume and write some data.
1. Upgrade to Longhorn v1.5.0.
1. The upgrade process will be stuck or fail.
1. Check if data is not corrupted.
1. Roll back to Longhorn v1.3.x with the same settings.
1. Longhorn v1.3.x will work normally.
### Upgrade strategy
`None`
## Note
`None`

View File

@ -0,0 +1,532 @@
# Extend CSI snapshot to support Longhorn BackingImage
## Summary
In Longhorn, we have BackingImage for VM usage. We would like to extend the CSI Snapshotter to support BackingImage management.
### Related Issues
[BackingImage Management via VolumeSnapshot #5005](https://github.com/longhorn/longhorn/issues/5005)
## Motivation
### Goals
Extend the CSI snapshotter to support:
- Create Longhorn BackingImage
- Delete Longhorn BackingImage
- Create a new PVC from a CSI snapshot that is associated with a Longhorn BackingImage
### Non-goals [optional]
- Support COW over each relative base image for delta data transfer and better space efficiency. (Will be addressed in a follow-up improvement.)
- Users can back up a BackingImage-based volume and restore it in another cluster without manually preparing the BackingImage in the new cluster.
## Proposal
### User Story
With this improvement, users can use standard CSI VolumeSnapshot as the unified interface for BackingImage creation, deletion and restoration of a Volume.
### User Experience In Detail
To use this feature, users need to deploy the CSI snapshot CRDs and the related controller.
1. The instructions are in our documentation: https://longhorn.io/docs/1.4.1/snapshots-and-backups/csi-snapshot-support/enable-csi-snapshot-support/
2. Create a VolumeSnapshotClass with type `bi`, which refers to BackingImage:
```yaml
kind: VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1
metadata:
  name: longhorn-snapshot-vsc
driver: driver.longhorn.io
deletionPolicy: Delete
parameters:
  type: bi
  export-type: qcow2 # default to raw if it is not provided
```
#### BackingImage creation via VolumeSnapshot resource
Users can create a BackingImage of a Volume by creating a VolumeSnapshot. Example below for a Volume named `test-vol`:
```yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: test-snapshot-backing
spec:
  volumeSnapshotClassName: longhorn-snapshot-vsc
  source:
    persistentVolumeClaimName: test-vol
```
Longhorn will create a BackingImage **exported** from this Volume.
#### Restoration via VolumeSnapshot resource
Users can create a Volume based on a previously created VolumeSnapshot. Example below for a Volume named `test-vol-restore`:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-vol-restore
spec:
  storageClassName: longhorn
  dataSource:
    name: test-snapshot-backing
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```
Longhorn will create a Volume based on the BackingImage associated with the VolumeSnapshot.
#### Restoration of an existing BackingImage (pre-provisioned)
Users can request the creation of a Volume based on an existing BackingImage which was not created via the CSI VolumeSnapshot.
With the BackingImage already existing, users need to create a VolumeSnapshotContent with an associated VolumeSnapshot. The `snapshotHandle` of the VolumeSnapshotContent needs to point to the existing BackingImage. Example below for a Volume named `test-restore-existing-backing` and an existing BackingImage `test-bi`:
- For pre-provisioning, users need to provide the following query parameters:
  - `backingImageDataSourceType`: `sourceType` of the existing BackingImage, e.g. `export-from-volume`, `download`
  - `backingImage`: Name of the BackingImage
  - You should also provide the `sourceParameters` of the existing BackingImage in the `snapshotHandle` for validation:
    - `export-from-volume`: you should provide
      - `volume-name`
      - `export-type`
    - `download`: you should provide
      - `url`
      - `checksum`: optional
```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotContent
metadata:
  name: test-existing-backing
spec:
  volumeSnapshotClassName: longhorn-snapshot-vsc
  driver: driver.longhorn.io
  deletionPolicy: Delete
  source:
    # NOTE: change this to point to an existing BackingImage in Longhorn
    snapshotHandle: bi://backing?backingImageDataSourceType=export-from-volume&backingImage=test-bi&volume-name=vol-export-src&export-type=qcow2
  volumeSnapshotRef:
    name: test-snapshot-existing-backing
    namespace: default
```
```yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: test-snapshot-existing-backing
spec:
  volumeSnapshotClassName: longhorn-snapshot-vsc
  source:
    volumeSnapshotContentName: test-existing-backing
```
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-restore-existing-backing
spec:
  storageClassName: longhorn
  dataSource:
    name: test-snapshot-existing-backing
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```
Longhorn will create a Volume based on the BackingImage associated with the VolumeSnapshot and the VolumeSnapshotContent.
#### Restoration of a non-existing BackingImage (on-demand provision)
Users can request the creation of a Volume based on a BackingImage which has not been created yet, using one of the following two kinds of data sources.
1. `download`: Download a file from a URL as a BackingImage.
2. `export-from-volume`: Export an existing in-cluster volume as a backing image.
Users need to create the VolumeSnapshotContent with an associated VolumeSnapshot. The `snapshotHandle` of the VolumeSnapshotContent needs to provide the parameters for the data source. Example below for a Volume named `test-on-demand-backing` and a non-existing BackingImage `test-bi` with two different data sources.
1. `download`: Users need to provide the following parameters:
   - `backingImageDataSourceType`: `download` for on-demand download.
   - `backingImage`: Name of the BackingImage
   - `url`: The URL of the file to download as the BackingImage.
   - `backingImageChecksum`: Optional. Used for checking the checksum of the file.
   - Example yaml:
```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotContent
metadata:
  name: test-on-demand-backing
spec:
  volumeSnapshotClassName: longhorn-snapshot-vsc
  driver: driver.longhorn.io
  deletionPolicy: Delete
  source:
    # NOTE: change this to provide the correct parameters
    snapshotHandle: bi://backing?backingImageDataSourceType=download&backingImage=test-bi&url=https%3A%2F%2Flonghorn-backing-image.s3-us-west-1.amazonaws.com%2Fparrot.qcow2&backingImageChecksum=bd79ab9e6d45abf4f3f0adf552a868074dd235c4698ce7258d521160e0ad79ffe555b94e7d4007add6e1a25f4526885eb25c53ce38f7d344dd4925b9f2cb5d3b
  volumeSnapshotRef:
    name: test-snapshot-on-demand-backing
    namespace: default
```
2. `export-from-volume`: Users need to provide the following parameters:
   - `backingImageDataSourceType`: `export-from-volume` for on-demand export.
   - `backingImage`: Name of the BackingImage
   - `volume-name`: Volume to be exported for the BackingImage
   - `export-type`: Currently Longhorn supports `raw` or `qcow2`
   - Example yaml:
```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotContent
metadata:
  name: test-on-demand-backing
spec:
  volumeSnapshotClassName: longhorn-snapshot-vsc
  driver: driver.longhorn.io
  deletionPolicy: Delete
  source:
    # NOTE: change this to provide the correct parameters
    snapshotHandle: bi://backing?backingImageDataSourceType=export-from-volume&backingImage=test-bi&volume-name=vol-export-src&export-type=qcow2
  volumeSnapshotRef:
    name: test-snapshot-on-demand-backing
    namespace: default
```
Users can then create the corresponding VolumeSnapshot and PVC:
```yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: test-snapshot-on-demand-backing
spec:
  volumeSnapshotClassName: longhorn-snapshot-vsc
  source:
    # NOTE: change this to point to the prior VolumeSnapshotContent
    volumeSnapshotContentName: test-on-demand-backing
```
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-restore-on-demand-backing
spec:
  storageClassName: longhorn
  dataSource:
    name: test-snapshot-on-demand-backing
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```
### API changes
No changes necessary
## Design
### Implementation Overview
We add a new type `bi` to the parameter `type` in the VolumeSnapshotClass. It means that the CSI VolumeSnapshot created with this VolumeSnapshotClass is associated with a Longhorn BackingImage.
#### CreateSnapshot function
When users create a VolumeSnapshot and the VolumeSnapshotClass `type` is `bi`:
```yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: test-snapshot-backing
spec:
  volumeSnapshotClassName: longhorn-snapshot-vsc
  source:
    persistentVolumeClaimName: test-vol
```
We do:
- Get the name of the Volume
- The name of the BackingImage will be the same as the VolumeSnapshot `test-snapshot-backing`.
- Check if a BackingImage with the same name as the requested VolumeSnapshot already exists; if so, return success without creating a new BackingImage.
- Create a BackingImage.
  - Get `export-type` from the VolumeSnapshotClass parameter `export-type`, defaulting to `raw`.
- Encode the `snapshotId` as `bi://backing?backingImageDataSourceType=export-from-volume&backingImage=test-snapshot-backing&volume-name=${VolumeName}&export-type=raw`
- This `snapshotId` will be used in the later CSI CreateVolume and DeleteSnapshot calls (a minimal encoding sketch follows).
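A minimal sketch of assembling that `snapshotId` with the standard `net/url` package; the function name and the fixed `export-from-volume` source type are illustrative only, and `url.Values.Encode` sorts the query keys, which does not change the meaning:
```golang
// encodeSnapshotBackingImageID sketches building the snapshotId described above, e.g.
// bi://backing?backingImage=...&backingImageDataSourceType=export-from-volume&export-type=raw&volume-name=...
func encodeSnapshotBackingImageID(backingImageName, volumeName, exportType string) string {
    params := url.Values{}
    params.Set("backingImageDataSourceType", "export-from-volume")
    params.Set("backingImage", backingImageName)
    params.Set("volume-name", volumeName)
    params.Set("export-type", exportType)
    return "bi://backing?" + params.Encode()
}
```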
#### CreateVolume function
- If the VolumeSource type is `VolumeContentSource_Snapshot`, decode the `snapshotId` to get the parameters (see the decoding sketch below).
  - `bi://backing?backingImageDataSourceType=${TYPE}&backingImage=${BACKINGIMAGE_NAME}&backingImageChecksum=${backingImageChecksum}&${OTHER_PARAMETERS}`
- If a BackingImage with the given name already exists, create the Volume.
- If a BackingImage with the given name does not exist, we prepare it first. There are two kinds of types: `export-from-volume` and `download`.
- For `download`, it means we have to prepare the BackingImage before creating the Volume. We first decode other parameters from `snapshotId` and create the BackingImage.
- For `export-from-volume`, it means we have to prepare the BackingImage before creating the Volume. We first decode other parameters from `snapshotId` and create the BackingImage.
NOTE: we already have related code for preparing the BackingImage with type `download` or `export-from-volume` before creating a Volume, [here](https://github.com/longhorn/longhorn-manager/blob/master/csi/controller_server.go#L195)
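A minimal sketch of the corresponding decoding step, shared conceptually by CreateVolume and DeleteSnapshot; the function name is illustrative and the standard `net/url` and `fmt` packages are assumed:
```golang
// decodeSnapshotBackingImageID sketches parsing a snapshotId of the form
// bi://backing?backingImageDataSourceType=...&backingImage=...&<other parameters>.
// It returns the BackingImage name, the data source type, and the remaining parameters.
func decodeSnapshotBackingImageID(snapshotID string) (backingImage, sourceType string, params map[string]string, err error) {
    u, err := url.Parse(snapshotID)
    if err != nil {
        return "", "", nil, err
    }
    if u.Scheme != "bi" {
        return "", "", nil, fmt.Errorf("%v is not a BackingImage type snapshotId", snapshotID)
    }
    query := u.Query()
    backingImage = query.Get("backingImage")
    sourceType = query.Get("backingImageDataSourceType")
    params = map[string]string{}
    for key := range query {
        if key == "backingImage" || key == "backingImageDataSourceType" {
            continue
        }
        params[key] = query.Get(key)
    }
    return backingImage, sourceType, params, nil
}
```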
#### DeleteSnapshot function
- Decode the `snapshotId` to get the name of the BackingImage. Then we delete the BackingImage directly.
### Test plan
Integration test plan.
#### Prerequisite
1. Deploy the CSI snapshot CRDs and controller as instructed at
https://longhorn.io/docs/1.4.1/snapshots-and-backups/csi-snapshot-support/enable-csi-snapshot-support/
2. Create a VolumeSnapshotClass with type `bi`
```yaml
# Use v1 as an example
kind: VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1
metadata:
  name: longhorn-snapshot-vsc
driver: driver.longhorn.io
deletionPolicy: Delete
parameters:
  type: bi
```
#### Scenario 1: Create VolumeSnapshot from a Volume
- Success
1. Create a Volume `test-vol` of 5GB. Create PV/PVC for the Volume.
2. Create a workload using the Volume. Write some data to the Volume.
3. Create a VolumeSnapshot with following yaml:
```yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: test-snapshot-backing
spec:
  volumeSnapshotClassName: longhorn-snapshot-vsc
  source:
    persistentVolumeClaimName: test-vol
```
4. Verify that the BackingImage is created.
   - Verify the properties of the BackingImage:
     - `sourceType` is `export-from-volume`
     - `volume-name` is `test-vol`
     - `export-type` is `raw`
5. Delete the VolumeSnapshot `test-snapshot-backing`
6. Verify the BackingImage is deleted
#### Scenario 2: Create new Volume from CSI snapshot
1. Create a Volume `test-vol` of 5GB. Create PV/PVC for the Volume.
2. Create a workload using the Volume. Write some data to the Volume.
3. Create a VolumeSnapshot with following yaml:
```yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: test-snapshot-backing
spec:
  volumeSnapshotClassName: longhorn-snapshot-vsc
  source:
    persistentVolumeClaimName: test-vol
```
4. Verify that the BackingImage is created.
5. Create a new PVC with the following yaml:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-restore-pvc
spec:
  storageClassName: longhorn
  dataSource:
    name: test-snapshot-backing
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```
6. Attach the PVC `test-restore-pvc` to a workload and verify the data
7. Delete the PVC
#### Scenario 3: Restore pre-provisioned BackingImage
1. Create a BackingImage `test-bi` using the Longhorn test image `https://longhorn-backing-image.s3-us-west-1.amazonaws.com/parrot.qcow2`
2. Create a VolumeSnapshotContent with `snapshotHandle` pointing to BackingImage `test-bi` and provide the parameters.
```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotContent
metadata:
  name: test-existing-backing
spec:
  volumeSnapshotClassName: longhorn-snapshot-vsc
  driver: driver.longhorn.io
  deletionPolicy: Delete
  source:
    snapshotHandle: bi://backing?backingImageDataSourceType=download&backingImage=test-bi&url=https%3A%2F%2Flonghorn-backing-image.s3-us-west-1.amazonaws.com%2Fparrot.qcow2&backingImageChecksum=bd79ab9e6d45abf4f3f0adf552a868074dd235c4698ce7258d521160e0ad79ffe555b94e7d4007add6e1a25f4526885eb25c53ce38f7d344dd4925b9f2cb5d3b
  volumeSnapshotRef:
    name: test-snapshot-existing-backing
    namespace: default
```
3. Create a VolumeSnapshot associated with the VolumeSnapshotContent
```yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: test-snapshot-existing-backing
spec:
  volumeSnapshotClassName: longhorn-snapshot-vsc
  source:
    volumeSnapshotContentName: test-existing-backing
```
4. Create a PVC with the following yaml
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-restore-existing-backing
spec:
  storageClassName: longhorn
  dataSource:
    name: test-snapshot-existing-backing
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```
5. Attach the PVC `test-restore-existing-backing` to a workload and verify the data
#### Scenario 4: Restore on-demand provisioned BackingImage
- Type `download`
1. Create a VolumeSnapshotContent with `snapshotHandle` providing the required parameters and BackingImage name `test-bi`
```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotContent
metadata:
  name: test-on-demand-backing
spec:
  volumeSnapshotClassName: longhorn-snapshot-vsc
  driver: driver.longhorn.io
  deletionPolicy: Delete
  source:
    snapshotHandle: bi://backing?backingImageDataSourceType=download&backingImage=test-bi&url=https%3A%2F%2Flonghorn-backing-image.s3-us-west-1.amazonaws.com%2Fparrot.qcow2&backingImageChecksum=bd79ab9e6d45abf4f3f0adf552a868074dd235c4698ce7258d521160e0ad79ffe555b94e7d4007add6e1a25f4526885eb25c53ce38f7d344dd4925b9f2cb5d3b
  volumeSnapshotRef:
    name: test-snapshot-on-demand-backing
    namespace: default
```
2. Create a VolumeSnapshot associated with the VolumeSnapshotContent
```yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: test-snapshot-on-demand-backing
spec:
  volumeSnapshotClassName: longhorn-snapshot-vsc
  source:
    volumeSnapshotContentName: test-on-demand-backing
```
3. Create a PVC with the following yaml
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-restore-on-demand-backing
spec:
  storageClassName: longhorn
  dataSource:
    name: test-snapshot-on-demand-backing
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```
4. Verify BackingImage `test-bi` is created
5. Attach the PVC `test-restore-on-demand-backing` to a workload and verify the data
- Type `export-from-volume`
- Success
1. Create a Volume `test-vol` and write some data to it.
2. Create a VolumeSnapshotContent with `snapshotHandle` providing the required parameters and BackingImage name `test-bi`
```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotContent
metadata:
  name: test-on-demand-backing
spec:
  volumeSnapshotClassName: longhorn-snapshot-vsc
  driver: driver.longhorn.io
  deletionPolicy: Delete
  source:
    snapshotHandle: bi://backing?backingImageDataSourceType=export-from-volume&backingImage=test-bi&volume-name=test-vol&export-type=qcow2
  volumeSnapshotRef:
    name: test-snapshot-on-demand-backing
    namespace: default
```
3. Create a VolumeSnapshot associated with the VolumeSnapshotContent
```yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: test-snapshot-on-demand-backing
spec:
  volumeSnapshotClassName: longhorn-snapshot-vsc
  source:
    volumeSnapshotContentName: test-on-demand-backing
```
4. Create a PVC with the following yaml
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-restore-on-demand-backing
spec:
  storageClassName: longhorn
  dataSource:
    name: test-snapshot-on-demand-backing
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```
5. Verify BackingImage `test-bi` is created
6. Attach the PVC `test-restore-on-demand-backing` to a workload and verify the data
### Upgrade strategy
No upgrade strategy needed
## Note [optional]
We need to update the docs and examples to reflect the new `bi` type for the parameter `type` in the VolumeSnapshotClass.

View File

@ -0,0 +1,68 @@
# Azure Blob Storage Backup Store Support
## Summary
Longhorn supports Azure Blob Storage as a backup storage.
### Related Issues
https://github.com/longhorn/longhorn/issues/1309
## Motivation
### Goals
- Support Azure Blob Storage as a backup storage.
## Proposal
- Introduce Azure Blob Storage client for supporting Azure Blob Storage as a backup storage.
## User Stories
Longhorn already supports NFSv4, CIFS and S3 servers as backup storage. However, some users may want to use Azure Blob Storage to push and pull backups.
### User Experience In Detail
- Users can configure an Azure Blob Storage as the backup storage
- Set **Backup Target**. The path to an Azure Blob Storage looks like
```bash
azblob://${container}@blob.core.windows.net/${path name}
```
- Set **Backup Target Credential Secret**
- Create a secret and deploy it
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: azblob-secret
  namespace: longhorn-system
type: Opaque
data:
  AZBLOB_ACCOUNT_NAME: ${AZBLOB_ACCOUNT_NAME}
  AZBLOB_ACCOUNT_KEY: ${AZBLOB_ACCOUNT_KEY}
```
- Set the setting **Backup Target Credential Secret** to `azblob-secret`
## Design
### Implementation Overview
- longhorn-manager
- Introduce the fields `AZBLOB_ACCOUNT_NAME` and `AZBLOB_ACCOUNT_KEY` in credentials. The two fields are passed to engine and replica processes for volume backup and restore operations.
- backupstore
- Implement Azure Blob Storage register/unregister and basic CRUD functions.
## Test Plan
### Integration Tests
1. Set an Azure Blob Storage as the backup storage.
2. Create volumes and write some data.
3. Back up volumes to the backup storage and the operation should succeed.
4. Restore backups and the operations should succeed.
5. Verify that the restored data is not corrupted.

View File

@ -0,0 +1,430 @@
# Engine Identity Validation
## Summary
Longhorn-manager communicates with longhorn-engine's gRPC ControllerService, ReplicaService, and SyncAgentService by
sending requests to TCP/IP addresses kept up-to-date by its various controllers. Additionally, the longhorn-engine
controller server sends requests to the longhorn-engine replica server's ReplicaService and SyncAgentService using
TCP/IP addresses it keeps in memory. These addresses are relatively stable in normal operation. However, during periods
of high process turnover (e.g. a node reboot or network event), it is possible for one longhorn-engine component to stop
and another longhorn-engine component to start in its place using the same ports. If this happens quickly enough, other
components with stale address lists attempting to execute requests against the old component may errantly execute
requests against the new component. One harmful effect of this behavior that has been observed is the [expansion of an
unintended longhorn-engine replica](https://github.com/longhorn/longhorn/issues/5709).
This proposal intends to ensure all gRPC requests to longhorn-engine components are actually served by the intended
component.
### Related Issues
https://github.com/longhorn/longhorn/issues/5709
## Motivation
### Goals
- Eliminate the potential for negative effects caused by a Longhorn component communicating with an incorrect
longhorn-engine component.
- Provide effective logging when incorrect communication occurs to aid in fixing TCP/IP address related race
conditions.
### Non-goals
- Fix race conditions within the Longhorn control plane that lead to attempts to communicate with an incorrect
longhorn-engine component.
- Refactor the in-memory data structures the longhorn-engine controller server uses to keep track of and initiate
communication with replicas.
## Proposal
Today, longhorn-manager knows the volume name and instance name of the process it is trying to communicate with, but it
only uses the TCP/IP information of each process to initiate communication. Additionally, longhorn-engine components are
mostly unaware of the volume name (in the case of longhorn-engine's replica server) and instance name (for both
longhorn-engine controller and replica servers) they are associated with. If we provide this information to
longhorn-engine processes when we start them and then have longhorn-manager provide it on every communication attempt,
we can ensure no accidental communication occurs.
1. Add additional flags to the longhorn-engine CLI that inform controller and replica servers of their associated volume
and/or instance name.
1. Use [gRPC client interceptors](https://github.com/grpc/grpc-go/blob/master/examples/features/interceptor/README.md)
to automatically inject [gRPC metadata](https://github.com/grpc/grpc-go/blob/master/Documentation/grpc-metadata.md)
(i.e. headers) containing volume and/or instance name information every time a gRPC request is made by a
longhorn-engine client to a longhorn-engine server.
1. Use [gRPC server interceptors](https://github.com/grpc/grpc-go/blob/master/examples/features/interceptor/README.md)
to automatically validate the volume and/or instance name information in [gRPC
metadata](https://github.com/grpc/grpc-go/blob/master/Documentation/grpc-metadata.md) (i.e. headers) every time a
gRPC request made by a longhorn-engine client is received by a longhorn-engine server.
1. Reject any request (with an appropriate error code) if the provided information does not match the information a
controller or replica server was launched with.
1. Log the rejection at the client and the server, making it easy to identify situations in which incorrect
communication occurs.
1. Modify instance-manager's `ProxyEngineService` (both server and client) so that longhorn-manager can provide the
necessary information for gRPC metadata injection.
1. Modify longhorn-manager so that it makes proper use of the new `ProxyEngineService` client and launches
longhorn-engine controller and replica servers with additional flags.
### User Stories
#### Story 1
Before this proposal:
As an administrator, after an intentional or unintentional node reboot, I notice one or more of my volumes is degraded
and new or existing replicas aren't coming online. In some situations, the UI reports confusing information or one or
more of my volumes might be unable to attach at all. Digging through logs, I see errors related to mismatched sizes, and
at least one replica does appear to have a larger size reported in `volume.meta` than others. I don't know how to
proceed.
After this proposal:
As an administrator, after an intentional or unintentional node reboot, my volumes work as expected. If I choose to dig
through logs, I may see some messages about refused requests to incorrect components, but this doesn't seem to
negatively affect anything.
#### Story 2
Before this proposal:
As a developer, I am aware that it is possible for one Longhorn component to communicate with another, incorrect
component, and that this communication can lead to unexpected replica expansion. I want to work to fix this behavior.
However, when I look at a support bundle, it is very hard to catch this communication occurring. I have to trace TCP/IP
addresses through logs, and if no negative effects are caused, I may never notice it.
After this proposal:
Any time one Longhorn component attempts to communicate with another, incorrect component, it is clearly represented in
the logs.
### User Experience In Detail
See the user stories above. This enhancement is intended to be largely transparent to the user. It should eliminate rare
failures so that users can't run into them.
### API Changes
#### Longhorn-Engine
Increment the longhorn-engine CLIAPIVersion by one. Do not increment the longhorn-engine CLIAPIMinVersion. The changes
in this LEP are backwards compatible. All gRPC metadata validation is by demand of the client. If a less sophisticated
(not upgraded) client does not inject any metadata, the server performs no validation. If a less sophisticated (not
upgraded) client only injects some metadata (e.g. `volume-name` but not `instance-name`), the server only validates the
metadata provided.
Add a global `volume-name` flag and a global `engine-instance-name` flag to the engine CLI (e.g. `longhorn -volume-name
<volume-name> -engine-instance-name <engine-instance-name> <command> <args>`). Virtually all CLI commands create a
controller client and these flags allow appropriate gRPC metadata to be injected into every client request. Requests
that reach the wrong longhorn-engine controller server are rejected.
Use the global `engine-instance-name` flag and the pre-existing `volume-name` positional argument to allow the
longhorn-engine controller server to remember its volume and instance name (e.g. `longhorn -engine-instance-name
<instance-name> controller <volume-name>`). Ignore the global `volume-name` flag, as it is redundant.
Use the global `volume-name` flag or the pre-existing local `volume-name` flag and a new `replica-instance-name` flag to
allow the longhorn-engine replica server to remember its volume and instance name (e.g. `longhorn -volume-name
<volume-name> replica <directory> -replica-instance-name <replica-instance-name>`).
Use the global `volume-name` flag and a new `replica-instance-name` flag to allow the longhorn-engine sync-agent server
to remember its volume and instance name (e.g. `longhorn -volume-name <volume-name> sync-agent -replica-instance-name
<replica-instance-name>`).
Add an additional `replica-instance-name` flag to CLI commands that launch asynchronous tasks that communicate directly
with the longhorn-engine replica server (e.g. `longhorn -volume-name <volume-name> add-replica <address> -size <size>
-current-size <current-size> -replica-instance-name <replica-instance-name>`). All such commands create a replica
client and these flags allow appropriate gRPC metadata to be injected into every client request. Requests that reach the
wrong longhorn-engine replica server are rejected.
Return 9 FAILED_PRECONDITION with an appropriate message when metadata validation fails. This code is chosen in
accordance with the [RPC API](https://grpc.github.io/grpc/core/md_doc_statuscodes.html), which instructs developers to
use FAILED_PRECONDITION if the client should not retry until the system state has been explicitly fixed.
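For illustration only, a rejection with this code might look like the following sketch, assuming the standard `google.golang.org/grpc/status` and `codes` packages (this is not the actual interceptor code, which appears later in this document):
```golang
// validateVolumeName sketches the rejection described above: when the volume name in
// the incoming gRPC metadata does not match the name the server was launched with,
// return 9 FAILED_PRECONDITION so the client knows not to retry until it is fixed.
func validateVolumeName(serverVolumeName, incomingVolumeName string) error {
    if serverVolumeName != "" && incomingVolumeName != "" && serverVolumeName != incomingVolumeName {
        return status.Errorf(codes.FailedPrecondition,
            "incorrect volume name %v; check controller address", incomingVolumeName)
    }
    return nil
}
```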
#### Longhorn-Instance-Manager
Increment the longhorn-instance-manager InstanceManagerProxyAPIVersion by one. Do not increment the
longhorn-instance-manager InstanceManagerProxyAPIMinVersion. The changes in this LEP are backwards compatible. No added
fields are required and their omission is ignored. If a less sophisticated (not upgraded) client does not include them,
no metadata is injected into engine or replica requests and no validation occurs (the behavior is the same as before the
implementation of this LEP).
Add `volume_name` and `instance_name` fields to the `ProxyEngineRequest` protocol buffer message. This message, which
currently only contains an `address` field, is included in all `ProxyEngineService` RPCs. Updated clients can pass
information about the engine process they expect to be communicating with in these fields. When instance-manager creates
an asynchronous task to carry out the requested operation, the resulting controller client includes the gRPC interceptor
described above.
Add `replica_instance_name` fields to any `ProxyEngineService` RPC associated with an asynchronous task that
communicates directly with a longhorn-engine replica server. When instance-manager creates the task, the resulting
replica client includes the gRPC interceptor described above.
Return 5 NOT_FOUND with an appropriate message when metadata validation fails at a lower layer. (The particular return
code is definitely open to discussion.)
## Design
### Implementation Overview
#### Interceptors (longhorn-engine)
Add a gRPC server interceptor to all `grpc.NewServer` calls.
```golang
server := grpc.NewServer(withIdentityValidationInterceptor(volumeName, instanceName))
```
Implement the interceptor so that it validates metadata with best effort.
```golang
func withIdentityValidationInterceptor(volumeName, instanceName string) grpc.ServerOption {
    return grpc.UnaryInterceptor(identityValidationInterceptor(volumeName, instanceName))
}

func identityValidationInterceptor(volumeName, instanceName string) grpc.UnaryServerInterceptor {
    // Use a closure to remember the correct volumeName and/or instanceName.
    return func(ctx context.Context, req any, info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (any, error) {
        md, ok := metadata.FromIncomingContext(ctx)
        if ok {
            incomingVolumeName, ok := md["volume-name"]
            // Only refuse to serve if both client and server provide validation information.
            if ok && volumeName != "" && incomingVolumeName[0] != volumeName {
                return nil, status.Errorf(codes.InvalidArgument, "Incorrect volume name; check controller address")
            }
        }
        if ok {
            incomingInstanceName, ok := md["instance-name"]
            // Only refuse to serve if both client and server provide validation information.
            if ok && instanceName != "" && incomingInstanceName[0] != instanceName {
                return nil, status.Errorf(codes.InvalidArgument, "Incorrect instance name; check controller address")
            }
        }
        // Call the RPC's actual handler.
        return handler(ctx, req)
    }
}
```
Add a gRPC client interceptor to all `grpc.Dial` calls.
```golang
connection, err := grpc.Dial(serviceUrl, grpc.WithInsecure(), withIdentityValidationInterceptor(volumeName, instanceName))
```
Implement the interceptor so that it injects metadata with best effort.
```golang
func withIdentityValidationInterceptor(volumeName, instanceName string) grpc.DialOption {
    return grpc.WithUnaryInterceptor(identityValidationInterceptor(volumeName, instanceName))
}

func identityValidationInterceptor(volumeName, instanceName string) grpc.UnaryClientInterceptor {
    // Use a closure to remember the correct volumeName and/or instanceName.
    return func(ctx context.Context, method string, req any, reply any, cc *grpc.ClientConn, invoker grpc.UnaryInvoker, opts ...grpc.CallOption) error {
        if volumeName != "" {
            ctx = metadata.AppendToOutgoingContext(ctx, "volume-name", volumeName)
        }
        if instanceName != "" {
            ctx = metadata.AppendToOutgoingContext(ctx, "instance-name", instanceName)
        }
        return invoker(ctx, method, req, reply, cc, opts...)
    }
}
```
Modify all client constructors to include this additional information. Wherever these client packages are consumed (e.g.
the replica client is consumed by the controller, both the replica and the controller clients are consumed by
longhorn-manager), callers can inject this additional information into the constructor and get validation for free.
```golang
func NewControllerClient(address, volumeName, instanceName string) (*ControllerClient, error) {
    // Implementation.
}
```
#### CLI Commands (longhorn-engine)
Add additional flags to all longhorn-engine CLI commands depending on their function.
E.g. command that launches a server:
```golang
func ReplicaCmd() cli.Command {
    return cli.Command{
        Name:      "replica",
        UsageText: "longhorn controller DIRECTORY SIZE",
        Flags: []cli.Flag{
            // Other flags.
            cli.StringFlag{
                Name:  "volume-name",
                Value: "",
                Usage: "Name of the volume (for validation purposes)",
            },
            cli.StringFlag{
                Name:  "replica-instance-name",
                Value: "",
                Usage: "Name of the instance (for validation purposes)",
            },
        },
        // Rest of implementation.
    }
}
```
E.g. command that directly communicates with both a controller and replica server.
```golang
func AddReplicaCmd() cli.Command {
    return cli.Command{
        Name:      "add-replica",
        ShortName: "add",
        Flags: []cli.Flag{
            // Other flags.
            cli.StringFlag{
                Name:     "volume-name",
                Required: false,
                Usage:    "Name of the volume (for validation purposes)",
            },
            cli.StringFlag{
                Name:     "engine-instance-name",
                Required: false,
                Usage:    "Name of the controller instance (for validation purposes)",
            },
            cli.StringFlag{
                Name:     "replica-instance-name",
                Required: false,
                Usage:    "Name of the replica instance (for validation purposes)",
            },
        },
        // Rest of implementation.
    }
}
```
#### Instance-Manager Integration
Modify the ProxyEngineService server functions so that they can make correct use of the changes in longhorn-engine.
Funnel information from the additional fields in the ProxyEngineRequest message and in appropriate ProxyEngineService
RPCs into the longhorn-engine task and controller client constructors so it can be used for validation.
```protobuf
message ProxyEngineRequest{
    string address = 1;
    string volume_name = 2;
    string instance_name = 3;
}
```
Modify the ProxyEngineService client functions so that consumers can provide the information required to enable
validation.
#### Longhorn-Manager Integration
Ensure the engine and replica controllers launch engine and replica processes with `-volume-name` and
`-engine-instance-name` or `-replica-instance-name` flags so that these processes can validate identifying gRPC metadata
coming from requests.
Ensure the engine controller supplies correct information to the ProxyEngineService client functions so that identity
validation can occur in the lower layers.
#### Example Validation Flow
This issue/LEP was inspired by [longhorn/longhorn#5709](https://github.com/longhorn/longhorn/issues/5709). In the
situation described in this issue:
1. An engine controller with out-of-date information (including a replica address the associated volume does not own)
[issues a ReplicaAdd
command](https://github.com/longhorn/longhorn-manager/blob/a7dd20cdbdb1a3cea4eb7490f14d94d2b0ef273a/controller/engine_controller.go#L1819)
to instance-manager's EngineProxyService.
2. Instance-manager creates a longhorn-engine task and [calls its AddReplica
method](https://github.com/longhorn/longhorn-instance-manager/blob/0e0ec6dcff9c0a56a67d51e5691a1d4a4f397f4b/pkg/proxy/replica.go#L35).
3. The task makes appropriate calls to a longhorn-engine controller and replica. The ReplicaService's [ExpandReplica
command](https://github.com/longhorn/longhorn-engine/blob/1f57dd9a235c6022d82c5631782020e84da22643/pkg/sync/sync.go#L509)
is used to expand the replica before a followup failure to actually add the replica to the controller's backend.
After this improvement, the above scenario will be impossible:
1. Both the engine and replica controllers will launch engine and replica processes with the `-volume-name` and
`-engine-instance-name` or `-replica-instance-name` flags.
2. When the engine controller issues a ReplicaAdd command, it will do so using the expanded embedded
`ProxyEngineRequest` message (with `volume_name` and `instance_name` fields) and an additional
`replica_instance_name` field.
3. Instance-manager will create a longhorn-engine task that automatically injects `volume-name` and `instance-name` gRPC
metadata into each controller request.
4. When the task issues an ExpandReplica command, it will do so using a client that automatically injects `volume-name`
and `instance-name` gRPC metadata into it.
5. If either the controller or the replica does not agree with the information provided, gRPC requests will fail
immediately and there will be no change in any longhorn-engine component.
### Test plan
#### TODO: Integration Test Plan
In my test environment, I have experimented with:
- Running new versions of all components, making gRPC calls to the longhorn-engine controller and replica processes with
wrong gRPC metadata, and verifying that these calls fail.
- Running new versions of all components, making gRPC calls to instance-manager with an incorrect volume-name or
instance name, and verifying that these calls fail.
- Running new versions of all components, adding additional logging to longhorn-engine and verifying that metadata
validation is occurring during the normal volume lifecycle.
This is really a better fit for a negative testing scenario (do something that would otherwise result in improper
communication, then verify that communication fails), but we have already eliminated the only known recreate for
[longhorn/longhorn#5709](https://github.com/longhorn/longhorn/issues/5709).
#### Engine Integration Test Plan
Rework test fixtures so that:
- All controller and replica processes are created with the information needed for identity validation.
- It is convenient to create controller and replica clients with the information needed for identity validation.
- gRPC metadata is automatically injected into controller and replica client requests when clients have the necessary
information.
Do not modify the behavior of existing tests. Since these tests are not using clients with identity validation information,
no identity validation is performed.
- Modify functions/fixtures that create engine/replica processes to allow the new flags to be passed, but do not pass
them by default.
- Modify engine/replica clients used by tests to allow for metadata injection, but do not enable it by default.
Create new tests that:
- Ensure validation fails when a directly created client attempts to communicate with a controller or replica server
using the wrong identity validation information.
- Ensure validation fails when an indirectly created client (by the engine) tries to communicate with a replica server
using the wrong identity validation information.
- Ensure validation fails when an indirectly created client (by a CLI command) tries to communicate with a controller or
replica server using the wrong identity validation information.
### Upgrade strategy
The user will get benefit from this behavior automatically, but only after they have upgraded all associated components
to a supporting version (longhorn-manager, longhorn-engine, and CRITICALLY instance-manager).
We will only provide volume name and instance name information to longhorn-engine controller and replica processes on a
supported version (as governed by the `CLIAPIVersion`). Even if other components are upgraded, when they send gRPC
metadata to non-upgraded processes, it will be ignored.
We will only populate extra ProxyEngineService fields when longhorn-manager is running with an updated ProxyEngineService
client.
- RPCs from an old client to a new ProxyEngineService server will succeed, but without the extra fields,
instance-manager will have no useful gRPC metadata to inject into its longhorn-engine requests.
- RPCs from a new client to an old ProxyEngineService will succeed, but instance-manager will ignore the new fields and
not inject useful gRPC metadata into its longhorn-engine request.
## Note
### Why gRPC metadata?
We initially looked at adding volume name and/or instance name fields to all longhorn-engine ReplicaService and
ControllerService calls. However, this would be awkward with some of the existing RPCs. In addition, it doesn't make
much intuitive sense. Why should we provide the name of an entity we are communicating with to that entity as part of
its API? It makes more sense to think of this identity validation in terms of sessions or authorization/authentication.
In HTTP, information of this nature is handled through the use of headers, and metadata is the gRPC equivalent.
### Why gRPC interceptors?
We want to ensure the same behavior in every longhorn-engine ControllerService and ReplicaService call so that it is not
up to an individual developer writing a new RPC to remember to validate gRPC metadata (and to relearn how it should be
done). Interceptors work mostly transparently to ensure identity validation always occurs.

Some files were not shown because too many files have changed in this diff.