llvm.org GIT mirror llvm / a490b3e
[VFS] Allow multiple RealFileSystem instances with independent CWDs.

Summary:
Previously only one RealFileSystem instance was available, and its working directory is shared with the process. This doesn't work well for multithreaded programs that want to work with relative paths - the vfs::FileSystem is assumed to provide the working directory, but a thread cannot control this exclusively.

The new vfs::createPhysicalFileSystem() factory copies the process's working directory initially, and then allows it to be independently modified.

This implementation records the working directory path, and glues it to relative paths to provide the correct absolute path to the sys::fs:: functions. This will give different results in unusual situations (e.g. the CWD is moved).

The main alternative is the use of openat(), fstatat(), etc to ask the OS to resolve paths relative to a directory handle which can be kept open. This is more robust. There are two reasons not to do this initially:
1. these functions are not available on all supported Unixes, and are somewhere between difficult and unavailable on Windows. So we need a path-based fallback anyway.
2. this would mean also adding support at the llvm::sys::fs level, which is a larger project. My clearest idea is an OS-specific `BaseDirectory` object that can be optionally passed to functions there. Eventually this could be backed by either paths or a fd where openat() is supported. This is a large project, and demonstrating here that a path-based fallback works is a useful prerequisite.

There is some subtlety to the path-manipulation mechanism:
  - when setting the working directory, both Specified=makeAbsolute(path) and Resolved=realpath(path) are recorded. These may differ in the presence of symlinks.
  - getCurrentWorkingDirectory() and makeAbsolute() use Specified - this is similar to the behavior of $PWD and sys::path::current_path
  - IO operations like openFileForRead use Resolved. This is similar to the behavior of an openat() based implementation, that doesn't see changes in symlinks.
There may still be combinations of operations and FS states that yield unhelpful behavior. This is hard to avoid with symlinks and FS abstractions :(

The caching behavior of the current working directory is removed in this patch. getRealFileSystem() is now specified to link to the process CWD, so the caching is incorrect.
The user who needed this so far is clangd, which will immediately switch to createPhysicalFileSystem().

Reviewers: ilya-biryukov, bkramer, labath
Subscribers: ioeric, kadircet, kristina, llvm-commits
Differential Revision: https://reviews.llvm.org/D56545

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351050 91177308-0d34-0410-b5e6-96231b3b80d8

Sam McCall 1 year, 9 months ago
3 changed file(s) with 161 addition(s) and 30 deletion(s).
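The path-gluing mechanism described in the summary can be illustrated outside LLVM. The following is a minimal sketch, assuming C++17 std::filesystem in place of llvm::sys::fs; the class name EmulatedCwd and its methods are hypothetical and are not part of the patch. It records both a Specified and a Resolved working directory, reports Specified to callers, and glues relative paths onto Resolved.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical stand-in for the per-instance working-directory state.
class EmulatedCwd {
public:
  // Copy the process's working directory once. Later changes to the
  // process CWD do not affect this object.
  EmulatedCwd() {
    std::error_code EC;
    fs::path PWD = fs::current_path(EC);
    if (EC)
      return; // Nothing sensible to record; both paths stay empty.
    fs::path Real = fs::canonical(PWD, EC);
    Specified = PWD;
    Resolved = EC ? PWD : Real; // Fall back to PWD if resolution fails.
  }

  // Record the path as given (made absolute) and its symlink-free form.
  std::error_code setCwd(const fs::path &P) {
    std::error_code EC;
    fs::path Absolute = adjust(P);
    if (!fs::is_directory(Absolute, EC))
      return EC ? EC : std::make_error_code(std::errc::not_a_directory);
    fs::path Real = fs::canonical(Absolute, EC);
    if (EC)
      return EC;
    Specified = Absolute;
    Resolved = Real;
    return {};
  }

  // What callers see as the working directory (comparable to $PWD).
  const fs::path &cwd() const { return Specified; }

  // The path handed to the OS: relative paths are glued onto Resolved.
  fs::path adjust(const fs::path &P) const {
    return P.is_absolute() ? P : Resolved / P;
  }

private:
  fs::path Specified, Resolved;
};
```

Two EmulatedCwd objects can point at different directories at the same time, and neither calls chdir(), so the process's own working directory is left alone. As the summary notes for the real implementation, a scheme like this gives different results from the OS if the directory is moved after it has been recorded.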
297297
298298 /// Gets an \p vfs::FileSystem for the 'real' file system, as seen by
299299 /// the operating system.
300 /// The working directory is linked to the process's working directory.
301 /// (This is usually thread-hostile).
300302 IntrusiveRefCntPtr<FileSystem> getRealFileSystem();
303
304 /// Create an \p vfs::FileSystem for the 'real' file system, as seen by
305 /// the operating system.
306 /// It has its own working directory, independent of (but initially equal to)
307 /// that of the process.
308 std::unique_ptr<FileSystem> createPhysicalFileSystem();
301309
302310 /// A file system that allows overlaying one \p AbstractFileSystem on top
303311 /// of another.
227227
228228 namespace {
229229
230 /// The file system according to your operating system.
230 /// A file system according to your operating system.
231 /// This may be linked to the process's working directory, or maintain its own.
232 ///
233 /// Currently, its own working directory is emulated by storing the path and
234 /// sending absolute paths to llvm::sys::fs:: functions.
235 /// A more principled approach would be to push this down a level, modelling
236 /// the working dir as an llvm::sys::fs::WorkingDir or similar.
237 /// This would enable the use of openat()-style functions on some platforms.
231238 class RealFileSystem : public FileSystem {
232239 public:
240 explicit RealFileSystem(bool LinkCWDToProcess) {
241 if (!LinkCWDToProcess) {
242 SmallString<128> PWD, RealPWD;
243 if (llvm::sys::fs::current_path(PWD))
244 return; // Awful, but nothing to do here.
245 if (auto Err = llvm::sys::fs::real_path(PWD, RealPWD))
246 WD = {PWD, PWD};
247 else
248 WD = {PWD, RealPWD};
249 }
250 }
251
233252 ErrorOr<Status> status(const Twine &Path) override;
234253 ErrorOr<std::unique_ptr<File>> openFileForRead(const Twine &Path) override;
235254 directory_iterator dir_begin(const Twine &Dir, std::error_code &EC) override;
241260 SmallVectorImpl<char> &Output) const override;
242261
243262 private:
244 mutable std::mutex CWDMutex;
245 mutable std::string CWDCache;
263 // If this FS has its own working dir, use it to make Path absolute.
264 // The returned twine is safe to use as long as both Storage and Path live.
265 Twine adjustPath(const Twine &Path, SmallVectorImpl<char> &Storage) const {
266 if (!WD)
267 return Path;
268 Path.toVector(Storage);
269 sys::fs::make_absolute(WD->Resolved, Storage);
270 return Storage;
271 }
272
273 struct WorkingDirectory {
274 // The current working directory, without symlinks resolved. (echo $PWD).
275 SmallString<128> Specified;
276 // The current working directory, with links resolved. (readlink .).
277 SmallString<128> Resolved;
278 };
279 Optional<WorkingDirectory> WD;
246280 };
247281
248282 } // namespace
249283
250284 ErrorOr<Status> RealFileSystem::status(const Twine &Path) {
285 SmallString<256> Storage;
251286 sys::fs::file_status RealStatus;
252 if (std::error_code EC = sys::fs::status(Path, RealStatus))
287 if (std::error_code EC =
288 sys::fs::status(adjustPath(Path, Storage), RealStatus))
253289 return EC;
254290 return Status::copyWithNewName(RealStatus, Path.str());
255291 }
257293 ErrorOr<std::unique_ptr<File>>
258294 RealFileSystem::openFileForRead(const Twine &Name) {
259295 int FD;
260 SmallString<256> RealName;
261 if (std::error_code EC =
262 sys::fs::openFileForRead(Name, FD, sys::fs::OF_None, &RealName))
296 SmallString<256> RealName, Storage;
297 if (std::error_code EC = sys::fs::openFileForRead(
298 adjustPath(Name, Storage), FD, sys::fs::OF_None, &RealName))
263299 return EC;
264300 return std::unique_ptr<File>(new RealFile(FD, Name.str(), RealName.str()));
265301 }
266302
267303 llvm::ErrorOr<std::string> RealFileSystem::getCurrentWorkingDirectory() const {
268 std::lock_guard<std::mutex> Lock(CWDMutex);
269 if (!CWDCache.empty())
270 return CWDCache;
271 SmallString<256> Dir;
304 if (WD)
305 return WD->Specified.str();
306
307 SmallString<128> Dir;
272308 if (std::error_code EC = llvm::sys::fs::current_path(Dir))
273309 return EC;
274 CWDCache = Dir.str();
275 return CWDCache;
310 return Dir.str();
276311 }
277312
278313 std::error_code RealFileSystem::setCurrentWorkingDirectory(const Twine &Path) {
279 // FIXME: chdir is thread hostile; on the other hand, creating the same
280 // behavior as chdir is complex: chdir resolves the path once, thus
281 // guaranteeing that all subsequent relative path operations work
282 // on the same path the original chdir resulted in. This makes a
283 // difference for example on network filesystems, where symlinks might be
284 // switched during runtime of the tool. Fixing this depends on having a
285 // file system abstraction that allows openat() style interactions.
286 if (auto EC = llvm::sys::fs::set_current_path(Path))
287 return EC;
288
289 // Invalidate cache.
290 std::lock_guard<std::mutex> Lock(CWDMutex);
291 CWDCache.clear();
314 if (!WD)
315 return llvm::sys::fs::set_current_path(Path);
316
317 SmallString<128> Absolute, Resolved, Storage;
318 adjustPath(Path, Storage).toVector(Absolute);
319 bool IsDir;
320 if (auto Err = llvm::sys::fs::is_directory(Absolute, IsDir))
321 return Err;
322 if (!IsDir)
323 return std::make_error_code(std::errc::not_a_directory);
324 if (auto Err = llvm::sys::fs::real_path(Absolute, Resolved))
325 return Err;
326 WD = {Absolute, Resolved};
292327 return std::error_code();
293328 }
294329
295330 std::error_code RealFileSystem::isLocal(const Twine &Path, bool &Result) {
296 return llvm::sys::fs::is_local(Path, Result);
331 SmallString<256> Storage;
332 return llvm::sys::fs::is_local(adjustPath(Path, Storage), Result);
297333 }
298334
299335 std::error_code
300336 RealFileSystem::getRealPath(const Twine &Path,
301337 SmallVectorImpl<char> &Output) const {
302 return llvm::sys::fs::real_path(Path, Output);
338 SmallString<256> Storage;
339 return llvm::sys::fs::real_path(adjustPath(Path, Storage), Output);
303340 }
304341
305342 IntrusiveRefCntPtr<FileSystem> vfs::getRealFileSystem() {
306 static IntrusiveRefCntPtr<FileSystem> FS = new RealFileSystem();
343 static IntrusiveRefCntPtr<FileSystem> FS(new RealFileSystem(true));
307344 return FS;
345 }
346
347 std::unique_ptr<FileSystem> vfs::createPhysicalFileSystem() {
348 return llvm::make_unique<RealFileSystem>(false);
308349 }
309350
310351 namespace {
332373
333374 directory_iterator RealFileSystem::dir_begin(const Twine &Dir,
334375 std::error_code &EC) {
335 return directory_iterator(std::make_shared<RealFSDirIter>(Dir, EC));
376 SmallString<128> Storage;
377 return directory_iterator(
378 std::make_shared<RealFSDirIter>(adjustPath(Dir, Storage), EC));
336379 }
337380
338381 //===-----------------------------------------------------------------------===/
381381 }
382382 operator StringRef() { return Path.str(); }
383383 };
384
385 struct ScopedFile {
386 SmallString<128> Path;
387 ScopedFile(const Twine &Path, StringRef Contents) {
388 Path.toVector(this->Path);
389 std::error_code EC;
390 raw_fd_ostream OS(this->Path, EC);
391 EXPECT_FALSE(EC);
392 OS << Contents;
393 OS.flush();
394 EXPECT_FALSE(OS.error());
395 if (EC || OS.error())
396 this->Path = "";
397 }
398 ~ScopedFile() {
399 if (Path != "")
400 EXPECT_FALSE(llvm::sys::fs::remove(Path.str()));
401 }
402 };
384403 } // end anonymous namespace
385404
386405 TEST(VirtualFileSystemTest, BasicRealFSIteration) {
408427 EXPECT_TRUE(I->path().endswith("a") || I->path().endswith("c"));
409428 I.increment(EC);
410429 EXPECT_EQ(vfs::directory_iterator(), I);
430 }
431
432 TEST(VirtualFileSystemTest, MultipleWorkingDirs) {
433 // Our root contains a/aa, b/bb, c, where c is a link to a/.
434 // Run tests both in root/b/ and root/c/ (to test "normal" and symlink dirs).
435 // Interleave operations to show the working directories are independent.
436 ScopedDir Root("r", true), ADir(Root.Path + "/a"), BDir(Root.Path + "/b");
437 ScopedLink C(ADir.Path, Root.Path + "/c");
438 ScopedFile AA(ADir.Path + "/aa", "aaaa"), BB(BDir.Path + "/bb", "bbbb");
439 std::unique_ptr<vfs::FileSystem> BFS = vfs::createPhysicalFileSystem(),
440 CFS = vfs::createPhysicalFileSystem();
441
442 ASSERT_FALSE(BFS->setCurrentWorkingDirectory(BDir.Path));
443 ASSERT_FALSE(CFS->setCurrentWorkingDirectory(C.Path));
444 EXPECT_EQ(BDir.Path, *BFS->getCurrentWorkingDirectory());
445 EXPECT_EQ(C.Path, *CFS->getCurrentWorkingDirectory());
446
447 // openFileForRead(), indirectly.
448 auto BBuf = BFS->getBufferForFile("bb");
449 ASSERT_TRUE(BBuf);
450 EXPECT_EQ("bbbb", (*BBuf)->getBuffer());
451
452 auto ABuf = CFS->getBufferForFile("aa");
453 ASSERT_TRUE(ABuf);
454 EXPECT_EQ("aaaa", (*ABuf)->getBuffer());
455
456 // status()
457 auto BStat = BFS->status("bb");
458 ASSERT_TRUE(BStat);
459 EXPECT_EQ("bb", BStat->getName());
460
461 auto AStat = CFS->status("aa");
462 ASSERT_TRUE(AStat);
463 EXPECT_EQ("aa", AStat->getName()); // unresolved name
464
465 // getRealPath()
466 SmallString<128> BPath;
467 ASSERT_FALSE(BFS->getRealPath("bb", BPath));
468 EXPECT_EQ(BB.Path, BPath);
469
470 SmallString<128> APath;
471 ASSERT_FALSE(CFS->getRealPath("aa", APath));
472 EXPECT_EQ(AA.Path, APath); // Reports resolved name.
473
474 // dir_begin
475 std::error_code EC;
476 auto BIt = BFS->dir_begin(".", EC);
477 ASSERT_FALSE(EC);
478 ASSERT_NE(BIt, vfs::directory_iterator());
479 EXPECT_EQ((BDir.Path + "/./bb").str(), BIt->path());
480 BIt.increment(EC);
481 ASSERT_FALSE(EC);
482 ASSERT_EQ(BIt, vfs::directory_iterator());
483
484 auto CIt = CFS->dir_begin(".", EC);
485 ASSERT_FALSE(EC);
486 ASSERT_NE(CIt, vfs::directory_iterator());
487 EXPECT_EQ((ADir.Path + "/./aa").str(), CIt->path()); // Partly resolved name!
488 CIt.increment(EC); // Because likely to read through this path.
489 ASSERT_FALSE(EC);
490 ASSERT_EQ(CIt, vfs::directory_iterator());
411491 }
412492
413493 #ifdef LLVM_ON_UNIX