Zero-config gopls workspaces
This issue describes a change to gopls' internal data model that will allow it to "Do The Right Thing" when the user opens a Go file. By decoupling the relationship between builds and workspace folders, we can eliminate complexity related to configuring the workspace (hence "zero-config"), and lay the groundwork for later improvements such as better support for working on multiple sets of build tags simultaneously (#29202).
After this change, users can work on multiple modules inside of a workspace regardless of whether they are related by a go.work file or explicitly open as separate workspace folders.
Background
Right now, gopls determines a unique build (called a View) for each workspace folder. When a workspace folder is opened, gopls performs the following steps:
- Request configuration for each workspace folder using the workspace/configuration request with scopeUri set to the folder URI.
- Using this configuration (which may affect the Go environment), resolve a root directory for this folder:
  a. Check go env GOWORK.
  b. Else, look for go.mod in a parent directory (recursively).
  c. Else, look for go.mod in a nested directory, if there is only one such nested directory. This was done to support polyglot workspaces where the Go project is in a nested directory, but is a source of both confusion and unpredictable startup time.
  d. Else, use the folder as root.
- Load package metadata for the workspace by calling go/packages.Load.
- Type check packages. We "fully" type-check packages that are inside a workspace module, and attempt to type-check only the exported symbols of packages in dependencies outside the workspace.
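Step (b) above amounts to an upward directory walk. A minimal sketch, assuming slash-separated paths and a hypothetical exists hook in place of real filesystem access (neither the function name nor the hook comes from gopls):

```go
package main

import (
	"fmt"
	"path"
)

// findModuleRoot walks upward from dir and returns the first directory
// containing a go.mod file, or "" if none is found.
// exists is a hypothetical hook standing in for a filesystem check.
func findModuleRoot(dir string, exists func(file string) bool) string {
	dir = path.Clean(dir)
	for {
		if exists(path.Join(dir, "go.mod")) {
			return dir
		}
		parent := path.Dir(dir)
		if parent == dir {
			return "" // reached the top without finding go.mod
		}
		dir = parent
	}
}

func main() {
	files := map[string]bool{"/home/user/proj/go.mod": true}
	exists := func(f string) bool { return files[f] }
	fmt.Println(findModuleRoot("/home/user/proj/internal/cache", exists))
	fmt.Println(findModuleRoot("/tmp/scratch", exists) == "")
}
```

The walk is bounded by the directory depth, which is why it is cheap compared with the nested-directory scan in step (c).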
Problems
There are several problems with this model:
- gopls startup involves scanning the entire workspace directory to find modules. If a user opens a home directory with millions of files, we pay a significant startup penalty (x/tools/gopls: very slow startup without go.mod or go.work #56496).
- The layout of gopls' internal data model depends on which directories are opened. Users must understand which directory to open, and gopls has to try to provide useful error messages when the workspace is misconfigured. This has been a significant source of confusion, and has led to various workarounds such as the "experimentalWorkspaceModule" setting, "expandWorkspaceToModule" setting, and "directoryFilters" setting.
- Confusingly, if there is only one module in a nested directory, gopls will work. But if there are two modules, gopls won’t work.
- If the user opens a file in a module that is not included in the workspace, gopls will simply not work, even though the go command may function properly when run from the file’s directory (as in the case where there are two nested modules but no go.work file).
- gopls does a lot of work eagerly when initialized, before the user opens a file or makes any request. This may not be desirable, particularly in polyglot workspaces.
- Ad-hoc packages (packages outside of GOPATH, with no go.mod) do not work well with gopls. They have limited support if the ad-hoc package directory is opened as a workspace folder, but have several bugs and don’t work as expected when multiple ad-hoc directories are present.
New Model
We can address these problems by decoupling Views from workspace folders. The set of views will be dynamic, depending on both the set of open folders and the set of open files, and will be chosen to cover all open files.
Specifically, define new View and Folder types approximately as follows:
    type Session struct {
        views   []*View
        folders []*Folder
        // other per-session fields
    }

    type View struct {
        viewType ViewType // workspace (go.work), module (go.mod), GOPATH, or adhoc
        source   URI      // go.work file, go.mod file, or directory
        modules  []URI    // set of modules contained in this View, if any
        options  *Options // options derived from either session options, or folder options
        // …per-view state, such as the latest snapshot
    }

    type ViewType int

    // Only the first constant names the type and iota; the rest repeat it implicitly.
    const (
        workspace ViewType = iota // go.work
        module                    // go.mod
        gopath                    // GOPATH directory
        adhoc                     // ad-hoc directory – see below
    )

    type Folder struct {
        dir     URI      // workspace folder
        options *Options // configuration scoped to the workspace folder
    }

A Session consists of a set of View objects describing modules (go.mod files), workspaces (go.work files), GOPATH directories or ad-hoc packages that the user is working on. This set is determined by both the workspace folders specified by the editor and the set of open files.
View types
- A workspace View is defined by a go.work file. source is the path to the go.work file.
- A module View is defined by a single go.mod file. source is the path to the go.mod file.
- A GOPATH View is defined by a folder inside a GOPATH directory, with GO111MODULE=off or GO111MODULE=auto and no go.mod file. source is the path to the directory.
- An adhoc View is defined by a folder outside of GOPATH, with no enclosing go.mod file. In this case, we consider files in the same directory to be part of a package, and source is the path to the directory.
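Read as a decision procedure over what is known about a directory, the four definitions might be sketched as follows. The dirInfo type and classify function are hypothetical, and the precedence order (go.work, then go.mod, then GOPATH, then ad-hoc) is taken from the workspace-folder algorithm in the next section. Treating GOWORK=off as "unset" is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

type ViewType int

const (
	workspace ViewType = iota // go.work
	module                    // go.mod
	gopath                    // GOPATH directory
	adhoc                     // ad-hoc directory
)

// dirInfo is a hypothetical summary of what was discovered about a directory.
type dirInfo struct {
	dir         string // the directory itself
	gowork      string // value of `go env GOWORK`; "" or "off" means unset (assumption)
	gomod       string // enclosing go.mod file, "" if none
	inGOPATH    bool   // whether dir is inside a GOPATH directory
	go111module string // "", "on", "off" or "auto"
}

// classify returns the View type and source for a directory.
func classify(d dirInfo) (ViewType, string, error) {
	switch {
	case d.gowork != "" && d.gowork != "off":
		return workspace, d.gowork, nil
	case d.gomod != "":
		return module, d.gomod, nil
	case d.inGOPATH && d.go111module == "on":
		return 0, "", errors.New("GO111MODULE=on, but no go.mod or go.work file was found")
	case d.inGOPATH:
		return gopath, d.dir, nil
	default:
		return adhoc, d.dir, nil
	}
}

func main() {
	typ, src, err := classify(dirInfo{dir: "/tmp/scratch"})
	fmt.Println(typ == adhoc, src, err)
}
```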
The set of Views
We define the set of Views to ensure that we have coverage for each open folder, and each open file.
- For each workspace folder, determine a View using the following algorithm:
  - If go env GOWORK is set, create a workspace View.
  - Else, look for go.mod in a parent directory. If found, create a module View.
  - Else, if the workspace folder is inside GOPATH, and GO111MODULE is not explicitly set to on, create a GOPATH View. If GO111MODULE=on explicitly, fail.
  - Else, create an adhoc View for the workspace folder. This may not be desirable for the user if they have modules contained in nested directories. In this case we could either prompt the user, or scan for modules in nested directories, creating Views for each (but notably if we do decide to scan the filesystem, we would create a View for each go.mod or go.work file encountered, rather than failing if there is more than one).
- For each open file, apply the following algorithm:
Match to an existing View
- Find the enclosing module corresponding to the file by searching parent directories for go.mod files.
- If a go.mod file is found, search for existing workspace or module type Views containing this module in their modules set.
- Search for existing GOPATH type Views whose source directory contains the file.
- Search for existing adhoc type Views whose source is equal to filepath.Dir(file).
If no existing View matches the file, create a new one
- Find a workspace folder containing the file, if any. If none is found, use a nil Folder (and therefore assume the default configuration). Note that if a workspace folder is found, the file is either in a module that is not included in the go.work file, or the folder is ad-hoc.
- If the file is in a module, define a new View of module type. Apply an explicit GOWORK=off to the View configuration to ensure that we can load the module.
- If the file is not in a module, define a new ad-hoc View.
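The per-file steps above can be sketched as a lookup followed by a fallback. This is an illustration only: the View struct is pared down, paths are plain slash-separated strings rather than URIs, the env field and both function names are hypothetical, and the workspace-folder lookup that supplies options is omitted.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

type ViewType int

const (
	workspace ViewType = iota // go.work
	module                    // go.mod
	gopath                    // GOPATH directory
	adhoc                     // ad-hoc directory
)

// View is a pared-down stand-in for the View type proposed above.
type View struct {
	typ     ViewType
	source  string   // go.work file, go.mod file, or directory
	modules []string // go.mod files covered by this View
	env     []string // extra environment (hypothetical field), e.g. GOWORK=off
}

func contains(list []string, s string) bool {
	for _, x := range list {
		if x == s {
			return true
		}
	}
	return false
}

// matchView returns an existing View covering file, or nil.
// gomod is the enclosing go.mod file, or "" if the file is not in a module.
func matchView(views []*View, file, gomod string) *View {
	if gomod != "" {
		for _, v := range views {
			if (v.typ == workspace || v.typ == module) && contains(v.modules, gomod) {
				return v
			}
		}
	}
	for _, v := range views {
		if v.typ == gopath && strings.HasPrefix(file, v.source+"/") {
			return v
		}
	}
	for _, v := range views {
		if v.typ == adhoc && v.source == path.Dir(file) {
			return v
		}
	}
	return nil
}

// viewForFile returns a View covering file, defining a new one if needed.
func viewForFile(views []*View, file, gomod string) ([]*View, *View) {
	if v := matchView(views, file, gomod); v != nil {
		return views, v
	}
	var v *View
	if gomod != "" {
		// GOWORK=off so the module loads even if a go.work file omits it.
		v = &View{typ: module, source: gomod, modules: []string{gomod}, env: []string{"GOWORK=off"}}
	} else {
		v = &View{typ: adhoc, source: path.Dir(file)}
	}
	return append(views, v), v
}

func main() {
	views := []*View{{typ: workspace, source: "/w/go.work", modules: []string{"/w/a/go.mod"}}}
	views, v1 := viewForFile(views, "/w/a/x.go", "/w/a/go.mod") // covered by the go.work View
	views, v2 := viewForFile(views, "/w/b/y.go", "/w/b/go.mod") // module missing from go.work
	views, v3 := viewForFile(views, "/tmp/s/z.go", "")          // no module at all
	fmt.Println(v1.source, v2.source, v3.source, len(views))
}
```

Matching runs in three passes so that module-based Views take priority over GOPATH and ad-hoc Views regardless of the order in which Views were created.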
Initializing views
Initialize views using the following logic. This essentially matches gopls’ current behavior.
- For workspace Views, load modulepath/... for each workspace module.
- For module Views, load modulepath/... for the main module.
- For GOPATH Views, load ./... from the View dir.
- For adhoc Views, load ./ from the View dir.
Type-check packages (and report their compiler diagnostics) as follows:
- For workspace Views, type-check any package whose module is a workspace module.
- For module Views, type-check any package whose module is the main module.
- For GOPATH Views, type-check any package contained in dir.
- For adhoc Views, type-check the ad-hoc package.
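As an illustration of the load step, the patterns passed to the package loader could be derived from the View type roughly as below. The modulePaths field and the loadPatterns function are hypothetical names for this sketch, and the actual call to go/packages.Load is left out.

```go
package main

import "fmt"

type ViewType int

const (
	workspace ViewType = iota // go.work
	module                    // go.mod
	gopath                    // GOPATH directory
	adhoc                     // ad-hoc directory
)

// View carries only what this sketch needs.
type View struct {
	typ         ViewType
	modulePaths []string // module paths such as "example.com/a" (hypothetical field)
}

// loadPatterns returns the package patterns to load when initializing v.
// GOPATH and ad-hoc patterns are relative to the View directory.
func loadPatterns(v View) []string {
	switch v.typ {
	case workspace, module:
		patterns := make([]string, 0, len(v.modulePaths))
		for _, m := range v.modulePaths {
			patterns = append(patterns, m+"/...")
		}
		return patterns
	case gopath:
		return []string{"./..."}
	default:
		return []string{"./"}
	}
}

func main() {
	ws := View{typ: workspace, modulePaths: []string{"example.com/a", "example.com/b"}}
	fmt.Println(loadPatterns(ws))
	fmt.Println(loadPatterns(View{typ: adhoc}))
}
```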
Resolving requests to Views
When a file-oriented request is handled by gopls (a request prefixed with textDocument/, such as textDocument/definition), gopls must usually resolve package metadata associated with the file.
In most cases, gopls currently chooses an existing view that best applies to the file (cache.bestViewForURI), but this is already problematic, because it can lead to path-dependency and incomplete results (see #57558). For example: when finding references from a package imported from multiple views, gopls currently only shows references in one view.
Wherever possible, gopls should multiplex queries across all Views and merge their results. This would lead to consistent behavior of cross references. In a future where gopls has better build-tag support, this could also lead to multiple locations for jump-to-definition results.
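One way to picture the multiplexing is a fan-out over Views followed by a deduplicating merge. The sketch below is not gopls code: Location, the refs map, and referencesAcrossViews are hypothetical stand-ins, and a real implementation would query snapshots rather than a static map.

```go
package main

import (
	"fmt"
	"sort"
)

// Location identifies a reference; it is comparable so it can be a map key.
type Location struct {
	URI  string
	Line int
}

// View is a hypothetical stand-in exposing only a references query.
type View struct {
	refs map[string][]Location // symbol -> references known to this View
}

func (v *View) References(symbol string) []Location { return v.refs[symbol] }

// referencesAcrossViews queries every View and merges the results,
// dropping locations reported by more than one View.
func referencesAcrossViews(views []*View, symbol string) []Location {
	seen := make(map[Location]bool)
	var out []Location
	for _, v := range views {
		for _, loc := range v.References(symbol) {
			if !seen[loc] {
				seen[loc] = true
				out = append(out, loc)
			}
		}
	}
	// Sort so the result does not depend on the order of Views.
	sort.Slice(out, func(i, j int) bool {
		if out[i].URI != out[j].URI {
			return out[i].URI < out[j].URI
		}
		return out[i].Line < out[j].Line
	})
	return out
}

func main() {
	a := &View{refs: map[string][]Location{"F": {{"b/b.go", 3}, {"a/a.go", 10}}}}
	b := &View{refs: map[string][]Location{"F": {{"b/b.go", 3}, {"c/c.go", 7}}}}
	fmt.Println(referencesAcrossViews([]*View{a, b}, "F"))
}
```

Sorting the merged result is one way to avoid the path-dependency described above: the answer is the same whichever View happens to be consulted first.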
In some cases (for example hover or signatureHelp), we must pick one view. In these cases we can apply some heuristic, but it should be of secondary significance (any hover or signatureHelp result is better than none).
Updating Views
Based on the algorithms used to determine Views above, the following notifications may affect the set of Views:
- didOpen and didClose cause gopls to re-evaluate Views, ensuring that we have a View for each open file contained in a workspace folder.
- didChangeConfiguration and didChangeWorkspaceFolders cause gopls to update Folder layout and configuration. Note that configuration may affect e.g. GOWORK values and therefore may lead to a new set of Views.
- didChange or didChangeWatchedFiles may cause gopls to re-evaluate Views if the change is to a go.mod or go.work file (for example, a go.mod file deleted or added, or a go.work file changed in any way).
Following these changes, gopls will re-run the algorithm above to determine a new set of Views. It will re-use existing Views that have not changed.
Whenever new Views are created, they are initialized as described in "Initializing views" above.
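Re-using unchanged Views amounts to a set difference between the current and the newly computed View definitions. A minimal sketch, assuming each View can be summarized by a comparable key; the viewDef type, its fields, and diffViews are hypothetical:

```go
package main

import "fmt"

// viewDef is a hypothetical comparable description of a View: two Views
// with equal definitions are treated as interchangeable.
type viewDef struct {
	typ    string // "workspace", "module", "gopath" or "adhoc"
	source string // go.work file, go.mod file, or directory
	folder string // workspace folder supplying options, "" for defaults
}

// diffViews compares the current set of Views with the newly computed one
// and reports which definitions can be reused, created, or shut down.
func diffViews(current, wanted []viewDef) (reuse, create, shutdown []viewDef) {
	have := make(map[viewDef]bool, len(current))
	for _, d := range current {
		have[d] = true
	}
	want := make(map[viewDef]bool, len(wanted))
	for _, d := range wanted {
		if want[d] {
			continue // ignore duplicates in the wanted set
		}
		want[d] = true
		if have[d] {
			reuse = append(reuse, d)
		} else {
			create = append(create, d)
		}
	}
	for _, d := range current {
		if !want[d] {
			shutdown = append(shutdown, d)
		}
	}
	return reuse, create, shutdown
}

func main() {
	current := []viewDef{{"module", "/w/a/go.mod", "/w"}, {"adhoc", "/tmp/s", ""}}
	wanted := []viewDef{{"module", "/w/a/go.mod", "/w"}, {"module", "/w/b/go.mod", "/w"}}
	reuse, create, shutdown := diffViews(current, wanted)
	fmt.Println(len(reuse), len(create), len(shutdown))
}
```

Only the definitions in create need the initialization described earlier; reused Views keep their loaded state.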
Differences from the current model
The algorithms described above are not vastly divergent from gopls’ current behavior. The significant differences may be summarized as follows:
- Rather than having one view per folder, we may have multiple Views per folder (or even multiple folders per view, such as the case where several folders resolve to a common go.work file).
- The set of ‘workspace modules’ in a given View is static. As go.mod files are added or removed, or go.work files changed, we reconfigure the set of Views. This simplifies the logic of handling metadata invalidation in each view.
Downsides
While this change will make gopls “do the right thing” in more cases, there are several notable downsides:
- Users may experience increased memory usage simply due to the fact that gopls successfully loads more packages. We are working on a separate redesign that will allow us to hold significantly less information in memory per view.
- When opening a new workspace, if there are no open files gopls may resolve less information, for example if there is no go.work or go.mod in a parent directory of the workspace folder. This means that workspace/symbol requests may return empty results, or results that depend on the set of open files. Users can mitigate this by using a go.work file.
- Workspace-wide queries such as “find references” may become more confusing to users. For example, if the user has modules a and b in their workspace, and a depends on a version of b in the module cache, find references on a symbol in the b directory of the workspace will not include references in a. Users can mitigate this by using a go.work file. It would also be possible for us to implement looser heuristics in our references search.
Future extension to build tags
By decoupling Views from workspace folders, it becomes possible for gopls to support working on multiple sets of build tags simultaneously. One can imagine that the algorithm above to compute views based on open files could be extended to GOOS and GOARCH: if an open file is not included in an existing view because of its GOOS or GOARCH build constraints, create a new view with updated environment.
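To make the idea concrete, the standard go/build/constraint package can evaluate a file's //go:build line against a candidate GOOS/GOARCH pair. The sketch below is an assumption about how such a check might look, not the planned algorithm: it ignores file-name suffixes such as _windows.go, umbrella tags such as unix, cgo, and Go version tags, and the candidate order passed to pickPlatform is arbitrary.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// matches reports whether a //go:build line is satisfied when only
// the goos and goarch tags are considered true.
func matches(line, goos, goarch string) (bool, error) {
	expr, err := constraint.Parse(line)
	if err != nil {
		return false, err
	}
	ok := expr.Eval(func(tag string) bool {
		return tag == goos || tag == goarch
	})
	return ok, nil
}

// pickPlatform returns the first candidate {GOOS, GOARCH} pair that
// satisfies line, so a View with that environment could cover the file.
func pickPlatform(line string, candidates [][2]string) (goos, goarch string, found bool) {
	for _, c := range candidates {
		if ok, err := matches(line, c[0], c[1]); err == nil && ok {
			return c[0], c[1], true
		}
	}
	return "", "", false
}

func main() {
	candidates := [][2]string{{"linux", "amd64"}, {"windows", "amd64"}, {"darwin", "arm64"}}
	goos, goarch, found := pickPlatform("//go:build windows && amd64", candidates)
	fmt.Println(goos, goarch, found)
}
```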
The downsides above apply: potentially increased memory, and potentially confusing UX as the behavior of certain workspace-wide queries (such as references or workspace symbols) depends on the set of open files. We can work to mitigate these downsides, and in my opinion they do not outweigh the upsides, as these queries simply don't work in the current model.
Task List
Here's an approximate plan of attack for implementing this feature, which I'm aiming to complete by the end of the year. (migrated from #57979 (comment)).
This is inside baseball, but may be interesting to @adonovan and @hyangah.
Phase 1: making Views immutable
- Port all the completion tests to the new marker framework. The old marker framework toggles options before every single completion assertion, and if that causes the views to be recreated, the tests run far too slowly.
- Remove the "minorOptionsChange" logic: every new configuration should result in a new view. The existing logic to avoid recreating the view is almost certainly a premature optimization.
- Extract the Session.getWorkspaceInformation logic to be unit-testable, and rename/refactor. This will be the basis of the new workspace algorithm.
- Isolate "folder" information into the Folder type, to reuse across new/multiple views.
- Invert the control of NewView, which should no longer query workspace information, but rather should be provided immutable workspace and folder information.
- Move management of the go.work and workspace mod files outside of the view, and into what is currently called getWorkspaceInformation. Whenever a go.work file changes, create a new view (at least if the new go.work parses and is saved to disk).
- Move all the mutable view state (such as vulnerabilities) into the snapshot. This is actually necessary for correctness anyway.
At this point, gopls should still behave identically, but Views will have immutable options and main modules. There may be a bit more churn when configuration or go.work files change, but such events should be very infrequent, and it seems reasonable to reload the workspace when they occur.
Phase 2: supporting multiple views per folder
- Where it makes sense, multiplex LSP queries across all views. We currently do this only for WorkspaceSymbols, but it makes sense for all workspace-wide queries such as finding references or implementations.
- Rewrite the diagnostic logic to merge diagnostics from different views. I'm not sure exactly how this will work: the current logic enforces freshness by tracking a monotonic snapshot counter, but that ends up being rather complicated, and we should do better.
- Change the bestViewForURI logic to return nil if no view matches, and lift up snapshot.contains to the View, since views are now immutable.
- On any didOpen/didClose/didChangeConfiguration/didChange* (of go.work files), recompute the set of Views necessary to cover all open files (computeViews). Compute the diff with the current set, and minimally (re)create views that are necessary.
- Move memoizedFS to each view, so that it is naturally reset whenever views change. Since views affect the set of file watchers, changes to the set of views may pull in files whose changes weren't observed, so we need to re-read.
Phase 3: support for multiple GOOS/GOARCH combinations
- Extend the computeViews algorithm to consider GOOS and GOARCH combinations. I'm not sure exactly how this algorithm will work: presumably if the current GOOS/GOARCH combination doesn't match an open file, we'll pick another, but the algorithm to pick another is non-trivial.