How WSL accesses Linux files from Windows

>>Hi, hello. Welcome to another deep dive into the Windows Subsystem for Linux architecture. My name is Craig Loewen. I’m a Program Manager working on WSL, and I’m joined today by...
>>Sven Groot, dev working on WSL.
>>Today, we’re going to be looking at exactly how the newest update of WSL allows you to access Linux files from Windows. Luckily, Sven is the developer who made this feature possible, so we’re going to jump into that. Before we actually get into how that works, how do we access Windows files again? How does that whole process work?
>>All right, let’s first
take a quick look at how the file system stuff in WSL works. We’ve talked about this before, and I drew this exact diagram for it, so let me do it again. So if we just make our [inaudible] up here. My handwriting is still terrible. [inaudible] So if you launch WSL, you see, [inaudible] the LXSS Manager service creates a WSL [inaudible]. That will launch some binaries: the init binary, and that will then launch Bash and whatever else you are running. These are pico processes, as part of the subsystem. Their system calls don’t go to normal Windows. They go to our [inaudible] driver, [inaudible] lxcore. We have a layer in here that handles the system calls.
>>All of these are routed
down to the NT kernel.
>>Yes.
>>Right.
>>There’s a lot of stuff we have to do inside of lxcore, because there are a lot of differences between the two. If it were as easy as mapping this system call [inaudible] to that system call in Windows, then that would be a lot less work to do. Especially on the file system side, there’s a lot of stuff we need to emulate inside of lxcore.
>>Right.
>>[inaudible] something like
that. The system calls get routed down to the [inaudible] layer inside lxcore. That’s the virtual file system that brings all the file systems together, exactly the same as the VFS layer in real Linux. Basically, that uses different, we call them “Balance” and “Plugins”, kind of analogous to the file system drivers you would usually have. Traditionally, we have WslFs, formerly [inaudible], which is the internal file system.
>>Just to be totally clear, WslFs stands for the Windows Subsystem for Linux File System, right?
>>Yes.
>>Okay, good thing we’re sure.
>>That’s your internal file system. If you have [inaudible], this is a slightly updated version of that.
>>Right.
>>The same thing. Everything
you have that lives in your system files or home directory, everything that’s inside of WSL. We have DrvFs, which is what accesses your Windows files. And then there’s stuff like [inaudible], for example, [inaudible] that don’t necessarily represent physical files on disk. So, let’s finish this box. Underneath this all is the [inaudible].
>>Great.
>>And that has NTFS, obviously.
>>Yeah.
>>So both WslFs, the internal file system, and DrvFs access files that are stored on your real NTFS disk.
>>Great.
>>DrvFs calls [inaudible] and whatever, but that’s not really relevant for this discussion. So even your internal Linux files are stored on disk; you can find them. They would be in the Store apps directory somewhere…
>>Right, somewhere in AppData.
>>Yeah. But we have basically always said, “Don’t touch that.” If you touch that, things are going to go wrong.
>>And to be clear, that
rule still exists. It means don’t go there.
>>[inaudible] That does not change. There are a few reasons for that. One of them is that we store metadata in these files, Linux things like the permissions, the mode bits, and some extra file types like FIFOs, sockets, and device files, all the stuff that Windows doesn’t really have. So it adds metadata to these files that represents these extra attributes.
>>Right.
>>Unfortunately, a lot of text editors will strip that out when you save a file.
>>Yes.
>>So that will break those files. [inaudible] The other reason is that, for performance reasons, on the internal side, in VFS and in WslFs, we do caching. If you start changing these files while the distribution is running, the cache will get out of sync, because we don’t look for changes that happen outside WSL. We assume that we are the only ones that are [inaudible].
>>Great.
>>That also doesn’t really work. People have always
told us, “Hey, we want to be able to edit these files. We want to be able to add an [inaudible] or ‘Receive file’. Maybe not know [inaudible].” Actually, note that [inaudible], so that actually is possible. People have wanted to do that, and it’s been one of the top feature asks for a very long time. So we wanted to find a way to make that possible and, basically, to make sure that all of our semantics are preserved, that all of our metadata works [inaudible], and also other stuff like inotify events.
>>Right.
>>In DrvFs, we use
mechanisms to translate Windows file system notifications to Linux inotify events. For WslFs, because again we assume that nobody’s making changes on the Windows side, we don’t do that, which lets us be more accurate, because there’s not a good one-to-one mapping between the Linux events and the Windows ones.
>>Right, exactly.
>>So, in order to make sure that all of those semantics still work correctly, basically, that means that when we access these files from Windows, we want to go through this stack. We want to make sure that everything we need to do to make all these things make sense on the Linux side happens. So essentially, the way we ended up doing that is by kind of treating it like a network file system, even though it’s on the same machine.
>>Okay.
>>And we’re doing that by running a [inaudible], we should make this one bigger, we’re running a Plan 9 server inside of it.
>>Okay, what does a Plan 9 server exactly entail? What does that mean?
>>So I have noticed there
has been some confusion about this name.
>>Right. Plan 9 is like an old operating system, I believe. What we’re actually referring to when we say this is a file system protocol that was part of that operating system. It’s called 9P, and the exact version we are using is 9P2000.L.
>>Got it.
>>Which is basically a version of the Plan 9 file system protocol with some Linux-specific extensions.
>>Okay. So this is a server running a specific protocol, which is the 9P protocol.
>>Yeah. We just refer to this as Plan 9 for simplicity. So even that’s confusing.
>>Okay.
>>The reason we went with this, and not something
like SMB or whatever. There are a few reasons why it could’ve been nice to use SMB, since obviously the Windows side has SMB or whatever. The problem is inside of Linux. Well, there’s Samba, obviously, and Samba works on WSL. So that sounds great. The problem is that Samba may or may not be in your Linux distribution.
>>Right.
>>It may not be installed by default. We’d have to configure it. What if you already have a running Samba instance inside [inaudible] that you’re using for something?
>>Yes.
>>Do we overwrite that configuration? Do we try to coexist with it? Do we have multiple Samba daemons running or something? We can run into that kind of stuff. And if you’re dealing with a distribution that doesn’t have Samba, we can’t ship it, unfortunately, because it’s GPL, and Microsoft can’t really ship GPL code as part of Windows. So there were a number of reasons why that wasn’t really a great idea.
>>So that’s how we settled
on using the [inaudible].
>>Yeah. We could have written our own Linux SMB server, but SMB is a much more complicated protocol than 9P is.
>>Okay.
>>It just so happened that we already had a Plan 9 server.
>>Just lying around.
>>Yes. This was a Windows Plan 9 server. It’s actually used for Linux containers on Windows to access Windows files.
>>Okay, I see.
>>Because the Linux kernel has a Plan 9 client built in.
>>Right.
>>So we basically ported that server to Linux and use it as our file system server.
>>Got it.
>>Inside of WSL.
>>Very cool. So it exists here, inside of...
>>Yeah. So it just runs in init, in user mode. It talks to [inaudible] same as everything else does.
>>Right. This is running inside of the distribution.
>>Yeah.
>>Very easy. Okay.
>>Exactly. It’s running in init so you calculate.
>>Right. Okay.
>>So how does this work
from the Windows side?
>>So it talks to lxcore to access these files.
>>Yes. It just acts like any other Linux program running in it. It doesn’t have any special privileges or whatever.
>>Okay.
>>It does everything completely normally.
>>Yeah. So when I have a Windows process that’s now trying to access some file in here, how does that work on the Windows side?
>>Yeah. So the Windows
side of things. Let’s make sure they’re same print. So on the Windows side of things, you have a normal process like CMD, or Explorer, or whatever, and you’re trying to access a path, which we’ve decided will start with \\wsl$.
>>Okay.
>>And then the name of your distribution, let’s say [inaudible] for convenience.
>>Okay.
>>So when you issue a create for a file whose name starts with a double backslash, that is a UNC path.
>>Okay.
>>Which is normally a network path, you can nearly put it in here and end there.
>>Yes, exactly.
>>What actually happens is that requests for those kinds of file names get sent to a driver in Windows that’s called MUP, the Multiple UNC Provider.
>>Okay.
>>So MUP’s job is to figure out who should
handle this request.
>>Right.
>>It does that by having a list of what are known as redirectors. There are a couple that come with Windows. The most well-known one, obviously, is SMB.
>>Okay.
>>And now there is a new one, which is the Plan 9 redirector.
>>Okay. Is that something we’ve created?
>>Yes. That’s the one we’ve created for this. Then there are some others, like there’s [inaudible].
>>Yeah.
>>This is for if you connect to a machine via Remote Desktop and you’re sharing the host drives: you can access the [inaudible] via \\tsclient, and that works very much the same way as [inaudible].
>>Okay.
>>There could be others, like the NFS client in Windows, which is a redirector, or WebDAV, which is a redirector.
>>Right.
>>There are so many different ones.
>>Why did you choose
to use \\wsl$?
>>Yeah. So basically, MUP goes through the list in priority order, and this is actually the standard priority order that they happen to be in if you’re on a new install. For example, with the tsclient example I just gave you, if you are connected to a remote desktop and you’re sharing your host drives, and you happen to have a machine on your network that’s called tsclient, that machine is now unreachable. Because MUP will give this path to RDPDR first, and that says, “Okay, I know how to handle this,” and then MUP stops. The same thing happens with
us. We basically get any request for any network path, but we just look at whether the server name is wsl$. If it’s not, then we pass it along; if it is, then we handle it. The reason we picked wsl$ is because we didn’t want to break you if you had a computer on your network that’s called wsl.
>>Okay.
>>The dollar sign is not a valid character for a machine name.
>>Right. Of course.
>>If you’re wondering, we could have put P9Rdr behind SMB in the priority order, but then you have the opposite problem. When you have a machine called wsl, now you can’t reach your WSL files anymore.
>>Right.
>>That also would mean that we would always have to wait for SMB to hit a timeout before we get the request.
>>Okay.
>>Which is just not desirable. So the Plan 9 redirector, it’s basically a normal
redirector in terms of Windows, so it acts like a network file system. It implements a client for the Plan 9 protocol. But it has some extra knowledge of how to talk to WSL. Basically, whenever a WSL instance is started, the Plan 9 server creates a Unix socket somewhere on your C drive. We’ve had, for a few releases, I think, the ability to interop between Windows and Linux via Unix sockets. We added Unix sockets to Windows, and we exposed the ability to use them to communicate between Windows processes and WSL processes. So that is what we’re doing here. When the Plan 9 server starts, it creates that Unix socket and listens on it, and LXSS Manager will talk to the redirector and tell it, “Hey, there’s a distribution called Ubuntu, and this is its Unix socket path.” So when we actually receive
a request for a network path, we first ask, “Is the server name wsl$?” If so, we can handle this. Next, we look at what would normally be the share name; for us, it is the distribution name. So we look at that: is this in our list of known [inaudible] running distributions? If it is, then we establish a connection to that Unix socket, and then we start exchanging 9P2000 messages with the server, just like any network protocol from that point on.
>>Then that connection is done through our LXSS Manager?
>>Yeah, well, the connection itself doesn’t go through LXSS Manager. There’s just [inaudible] socket somewhere.
>>A socket, that is where the communication is going through.
>>Communication goes straight, like that. LXSS Manager’s only job here is to tell the Plan 9 redirector where to connect to, and when the instance shuts down, we will also tell it, “Hey, you cannot connect to this anymore.”
>>Right. Okay, awesome. So what happens if
the distro isn’t currently running when we try to access this file?
>>Yeah. So that’s something we had to consider. Originally, we were thinking that was something we might do for the next release, not the initial one. Because if you have a file open, and LXSS Manager wants to shut down this distribution because there are no open Bash windows, and I think there’s a 15-second timeout once the last [inaudible] goes away, then it wants to shut that down. Now we check if the Plan 9 server has any open files, and if it does, then we won’t shut down the instance. So that’ll make sure that if something is open, we don’t just invalidate your handle and stuff breaks. The problem is with a lot
of text editors, like Visual Studio Code or Notepad++ or whatever. I think Notepad actually does keep files open, but most text editors don’t keep the file open. They open it, read the contents, close it, and they’re done. Then if you modify the file and hit save, at that point it will reopen the file and write the new contents, whatever. So we thought it would be pretty bad if halfway through that process your instance shuts down, and then you hit save and you get an error message.
>>Yes. That would be bad.
>>So, I mean, we could tell people, “Hey, don’t close your Bash windows while you’re using this.” But that’s not really a nice user experience.
>>Of course.
>>There are even situations
where Windows Update decides to restart your machine and reopens [inaudible] after the update, and then it has your file open and it’s trying to reload it, and now the instance isn’t running. Things like that. So in order to support that, what we did is, we had to add an extra service. That’s because of some context that isn’t present normally. The LXSS Manager service is running as Local System, and normally [inaudible] makes a COM call to the service, and then the service can impersonate you and start the distribution on your behalf. When we get a create, the actual security token that is to be used for that create is stored inside the create information, which may or may not be the actual token of the thread that we got the create from. Especially if file system filters get involved, which might spin up a worker thread to do something with these files. If we gave that token information to LXSS Manager, it wouldn’t be able to impersonate you, so it couldn’t start a distribution on your behalf. So we have to have something that’s actually already
running as you, as your user account. That is a new service, which is called, originally, LXSS Manager User [inaudible]. This LXSS Manager User is a per-user service. If you look in your list of running services, you will see it’s called LXSS Manager User, underscore, some weird number. That’s because every logged-on user gets an instance of this service that runs under their own user account. We actually, as a longer-term goal, want to move more of the functionality from our LXSS Manager to this service. Because a lot of the code in LXSS Manager doesn’t strictly need to run as Local System, so if we can reduce the privileges it has, that would be great. So what happens if
we get a create for Ubuntu, and the redirector doesn’t have Ubuntu on its list of known distributions and how to connect to them? So when the LXSS Manager User service starts, it actually sends an ioctl to, this is getting messy, whatever. It sends an ioctl to the redirector, which we keep in a pending state, and as soon as a create comes in for a distribution that we do not know, we complete that ioctl with the name of that distribution in the output buffer. LXSS Manager User talks to the regular LXSS Manager to see if there’s a distribution with that name and if it can start it. Sends [inaudible], then LXSS Manager can impersonate you and start the distribution and everything.
>>Right.
>>We wait for that distribution to come up, at which point LXSS Manager informs the redirector where to connect to it. Then once that’s done, the user service sends a new ioctl, both to tell it, “Hey, I am done,” and so that it can be used the next time you need to do this.
>>So all the mechanism
for starting it up...
>>Yeah.
>>Housed in there.
>>Yeah.
>>Right.
>>Once the redirector sees that reply, it basically retries all the creates that were waiting on it, to see, “Okay, do I now know where this distribution is and how to connect to it?”
>>Very cool.
>>So that summarizes the story of how we can access Linux files through Windows, and how it starts.
>>Exactly.
>>Awesome. Now that’s all the questions we had. So if you have any more, please feel free to reach out to us on Twitter. You can reach me, I’m @craigaloewen. Sven, you’re also on Twitter?
>>Yes. It’s @svengroot_ms. I am not very active on Twitter, but if I see somebody ask a question, I’ll try to res-
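
The path lookup described in this conversation can be sketched roughly as follows. This is only an illustrative Python model, not WSL code: MUP offers a UNC path to each redirector in priority order, and the Plan 9 redirector claims it only when the server name is wsl$, treating the share name as the distribution name. Every name here (`split_unc`, `Plan9Redirector`, `mup_dispatch`, the socket path, the exact redirector order, and the case-insensitive matching) is a hypothetical choice made for the example.

```python
from typing import Callable, Dict, List, Optional, Tuple


def split_unc(path: str) -> Optional[Tuple[str, str, str]]:
    """Split a UNC path into (server, share, rest), or return None if it is not UNC."""
    if not path.startswith("\\\\"):
        return None
    parts = path[2:].split("\\", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        return None
    rest = parts[2] if len(parts) == 3 else ""
    return parts[0], parts[1], rest


class Plan9Redirector:
    """Toy model of the redirector's bookkeeping (hypothetical API)."""

    def __init__(self) -> None:
        # distribution name (lower-cased, an assumption) -> Unix socket path
        self.running: Dict[str, str] = {}
        # distributions someone asked for that are not running yet
        self.pending_starts: List[str] = []

    def register(self, distro: str, socket_path: str) -> None:
        # Stands in for the manager telling the redirector where to connect.
        self.running[distro.lower()] = socket_path

    def unregister(self, distro: str) -> None:
        # Stands in for the "you cannot connect to this anymore" notification.
        self.running.pop(distro.lower(), None)

    def try_claim(self, path: str):
        parsed = split_unc(path)
        if parsed is None:
            return None
        server, share, rest = parsed
        if server.lower() != "wsl$":
            return None  # not ours: pass it along to the next redirector
        socket_path = self.running.get(share.lower())
        if socket_path is None:
            # Unknown distribution: record that a start has to be requested.
            self.pending_starts.append(share)
            return ("start-requested", share, rest)
        return ("connect", socket_path, rest)


def rdpdr_claim(path: str):
    parsed = split_unc(path)
    return "handled by RDPDR" if parsed and parsed[0].lower() == "tsclient" else None


def smb_claim(path: str):
    # Simplification: the SMB stand-in accepts any remaining UNC path.
    return "handled by SMB" if split_unc(path) else None


def mup_dispatch(path: str, redirectors: List[Tuple[str, Callable]]):
    """Offer the path to each redirector in priority order; the first claim wins."""
    for name, try_claim in redirectors:
        result = try_claim(path)
        if result is not None:
            return name, result
    return None


if __name__ == "__main__":
    p9 = Plan9Redirector()
    p9.register("Ubuntu", r"C:\hypothetical\ubuntu.sock")
    chain = [("rdpdr", rdpdr_claim), ("p9", p9.try_claim), ("smb", smb_claim)]
    for example in (r"\\wsl$\Ubuntu\home\user\notes.txt",
                    r"\\wsl$\Debian\etc\hosts",
                    r"\\fileserver\share\doc.txt"):
        print(example, "->", mup_dispatch(example, chain))
```

Because the Plan 9 stand-in sits ahead of the SMB stand-in and only matches wsl$, a machine that happens to be named wsl is still reachable through SMB, which is the trade-off discussed above.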