Direct VMDK manipulation investigation #1
@hickeng, I had a discussion with the Lifecycle team regarding this and received the following response: "you probably want to directly consume bits of disklib, ala diskTool. Writing useful state into the VMDK requires some tool that understands ext4. I vaguely recall the tools team having some code that inspects VMDK content invasively, but don't know the details." Asking further about ext4 indicates that no ext4 support is available on ESX, although this sort of thing could be done from inside a Linux VM (I'm sure this is not surprising). Further discussion with Mike Hall suggests we are out of luck as far as doing this in an application running directly on ESX, but a service VM (or perhaps just a process running on the VCH) should be able to perform these actions, via loopback block devices, in a way that is more efficient/performant than our current approach. I tend to agree that performing these operations directly on the host would, at the very least, be considerably more challenging than performing them inside a VM.
We're currently performing the manipulation inside a VM, and that introduces either bottlenecks or significant throughput degradation; the latter may just require optimization, which is TBD. Presenting the disk as a true block device in the VM requires attach/detach operations, which are serial and expensive because they entail a fast-suspend-resume, and we are limited to 4 SCSI controllers with 15 disks per controller. The serial element could be partially addressed with clever operation batching, but the latency isn't going to improve. Mounting the datastore directly (via 9p, for example) bypasses the need for attach/detach entirely (see BON-26), but throughput was significantly lower. If the throughput could be improved and the security implications addressed, then this latter approach becomes viable.
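The "clever operation batching" mentioned above could look something like the sketch below: amortize the per-cycle fast-suspend-resume cost by applying several disk attaches in one reconfigure. Here `reconfigure_vm` is a hypothetical stand-in for whatever applies a batch of attaches in a single suspend/resume cycle; it is not a real VIC or vSphere API, and the batch cap simply mirrors the 15-disks-per-controller limit.

```python
def attach_all(disks, reconfigure_vm, batch_size=15):
    """Attach disks in batches so that one fast-suspend-resume cycle
    serves up to batch_size attaches instead of one cycle per disk.

    reconfigure_vm(batch) is a hypothetical callback that performs a
    single VM reconfigure attaching every disk in `batch` at once.
    Returns the number of suspend/resume cycles incurred.
    """
    cycles = 0
    for i in range(0, len(disks), batch_size):
        reconfigure_vm(disks[i:i + batch_size])
        cycles += 1
    return cycles
```

With 40 pending attaches this costs 3 cycles rather than 40; note the topology still caps total capacity at 4 controllers × 15 disks = 60 attached disks, and per-request latency is unchanged, which matches the point above that batching helps throughput but not latency.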
The thought I had was: could we use something like disklib to create a VMDK on a normal filesystem in a guest, then loopback-mount it as a block device and use the normal Linux tools to format it and lay the archives onto that filesystem? Then unmount and move that VMDK (file) to the datastore.
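A minimal sketch of that workflow, assuming a raw/flat image for simplicity (a flat VMDK's data extent could be loop-mounted the same way; sparse or stream-optimized VMDKs would need disklib or similar to decode first). The filename `layer.tar` is illustrative, and the privileged steps are shown as comments since they require root inside the guest:

```shell
# Create a sparse backing file on an ordinary guest filesystem.
truncate -s 1G scratch.img

# The remaining steps need root inside the guest (illustrative only):
#   losetup --find --show scratch.img      # -> /dev/loopN, image as block device
#   mkfs.ext4 /dev/loopN                   # format with the normal Linux tools
#   mount /dev/loopN /mnt/scratch
#   tar -C /mnt/scratch -xf layer.tar      # lay the archive onto the filesystem
#   umount /mnt/scratch
#   losetup -d /dev/loopN
# Finally, copy/convert the resulting image onto the datastore
# (e.g. with vmkfstools on the ESX side).
```

The appeal of this shape is that the expensive per-disk attach/detach against the hypervisor disappears; the cost moves to copying the finished image to the datastore.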
@mlh78750 loopback to what? Do you create the VMDK in the hypervisor or in the guest? If in the guest, sure, but is there a driver for that which doesn't rely on FUSE?
I was under the assumption that we have no support for formatting ext4 and mounting it on the host. If that is not the case and we can, then that is the better solution. If we cannot, then yes, I was thinking of using something FUSE-based to mount the image.
@hickeng do you have a pointer to the bonneville 9p code? I recall there are bits that live in the daemon (but could be wrong) and bits that live in esx (but could be wrong). A pointer to both/either would be appreciated. |
This looks mostly like BON-147, BON-148, and BON-149. FYI in case we want to close those jira issues as this should probably subsume them. |
Parent issue: #38 |
Is this still relevant? @hickeng |
@gigawhitlocks still relevant, just not planned for the near future: this requires an ESX agent and should be deferred until the VMCI/vSocket communication work is in place (that will supply the necessary basics and is higher priority).
Otherwise we panic:

    [   74.127006 ] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000200
    [   74.145909 ] CPU: 0 PID: 232 Comm: tether Tainted: G E 4.4.41-2.ph1-esx #1-photon
    [   74.163249 ] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016
    [   74.185318 ] 0000000000000000 ffffffff8122091f ffffffff816f2418 ffff88007968bd08
    [   74.201909 ] ffffffff810c8ed6 0000000000000010 ffff88007968bd18 ffff88007968bcb0
    [   74.218554 ] ffff880073642948 0000000000000200 ffff88007bc9d010 0000000000000000
    [   74.235204 ] Call Trace:
    [   74.240879 ]  [<ffffffff8122091f>] ? dump_stack+0x5c/0x7d
    [   74.252128 ]  [<ffffffff810c8ed6>] ? panic+0xbf/0x1d7
    [   74.259385 ]  [<ffffffff81045735>] ? do_exit+0xa35/0xa40
    [   74.264534 ]  [<ffffffff8104579e>] ? do_group_exit+0x2e/0xa0
    [   74.269928 ]  [<ffffffff8104e3f6>] ? get_signal+0x1b6/0x530
    [   74.275283 ]  [<ffffffff810030ce>] ? do_signal+0x1e/0x5d0
    [   74.280619 ]  [<ffffffff812a88a2>] ? tty_read+0x92/0xe0
    [   74.285646 ]  [<ffffffff8111f1ae>] ? __vfs_read+0x1e/0xe0
    [   74.290838 ]  [<ffffffff8142dc5f>] ? __schedule+0x38f/0x800
    [   74.296177 ]  [<ffffffff8100104f>] ? exit_to_usermode_loop+0x7f/0xa0
    [   74.302314 ]  [<ffffffff81001405>] ? syscall_return_slowpath+0x45/0x50
    [   74.309296 ]  [<ffffffff814310cc>] ? int_ret_from_sys_call+0x25/0x8f
    [   74.343746 ] Kernel Offset: disabled
    [   74.595373 ] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000200

Fixes #4011
This commit adds an environment variable specifying the log-aggregation endpoint ($LOG_AGGR_ADDR) to drone's step for running integration tests on pull requests.
Investigate how we can directly manipulate VMDKs on ESX. The following high-level tasks are of specific interest: