Things to know if you are a new contributor to LibreOffice code
When I began contributing code to LibreOffice, I faced some issues because I didn't know several facts that the other active contributors knew. This blog post summarizes some of those facts, and I hope it will be useful for other new contributors!
1) The data types LibreOffice code uses -
If you have already browsed the LibreOffice codebase, you may have noticed that the code generally doesn't use data types like int, float, double, et cetera the way small C++ programs do. LibreOffice defines its own classes, typedefs and structs, and most of the code uses them instead of the built-in types. Examples include OUString (instead of string) and sal_uInt32 (instead of unsigned int).
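To give a feel for how such types read in practice, here is a minimal sketch that compiles outside the LibreOffice tree. It does not use the real headers: the sal_uInt32 alias and the OUStringStandIn type are stand-ins I define for illustration (inside the tree you would include sal/types.h and rtl/ustring.hxx instead), and countCodeUnit is a hypothetical helper, not a LibreOffice function.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in aliases, assumed for illustration only. In the LibreOffice tree
// these names come from <sal/types.h> and <rtl/ustring.hxx>.
using sal_uInt32 = std::uint32_t;
using OUStringStandIn = std::u16string; // hypothetical substitute for OUString

// The point of a fixed-width typedef: the size does not depend on the platform.
static_assert(sizeof(sal_uInt32) == 4, "sal_uInt32 stand-in must be 32 bits");

// Hypothetical helper: counts how often one UTF-16 code unit occurs in a string.
sal_uInt32 countCodeUnit(const OUStringStandIn& rText, char16_t cWanted)
{
    sal_uInt32 nCount = 0;
    for (char16_t c : rText)
    {
        if (c == cWanted)
            ++nCount;
    }
    return nCount;
}
```

Calling countCodeUnit(u"LibreOffice", u'e') with this sketch returns 2. The real OUString has a much richer interface than std::u16string, so treat this only as an illustration of the style.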
2) The CI infrastructure -
As soon as you send a patch to Gerrit, Jenkins starts building LibreOffice from your code on several platforms to check whether your change compiles. It also runs the built software against a number of test cases and checks whether your code is properly formatted. You are expected to read the Jenkins output and fix any errors it reports. But here's the catch: many of LibreOffice's tests are flaky, which means they can fail even when your change is not at fault. Examine the Jenkins output carefully, and if the error is unrelated to your change, rebase your change on the latest commit (which can be done from the Gerrit UI) and wait for Jenkins to build your code again.
3) The directory structure -
The LibreOffice core project contains a very large number of files, which can make it hard to find a particular piece of code. Knowing how the directories are structured helps. Some important points:
- Almost all directories have a README file that gives a brief introduction to the kind of code the directory contains.
- Many directories have a subdirectory named qa. QA stands for quality assurance, and these subdirectories contain the tests for the code implemented in the same directory.
- Many directories have a subdirectory named include or inc. These folders contain header files.
- Many directories have a subdirectory named source. It holds the main source code that the README file describes.
4) The variable name prefixes -
If the file you are working on uses letter prefixes in the names of local variables, please use the same prefixes in the patch you send. A prefix generally denotes the data type of the variable. The meanings of some of the prefixes are listed below:
Prefix | Meaning |
---|---|
b | Boolean |
p | Pointer |
r | Reference |
n | Integer/Numeric Type |
c | Character |
a | Anything else |
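As a rough illustration of the table, the sketch below uses each prefix once. The function countMatches and all of its parameters are hypothetical names made up for this example, and std::string is used only so that the snippet compiles on its own outside the LibreOffice tree.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical function, written only to show the prefixes from the table:
// r = reference, c = character, b = Boolean, p = pointer, n = numeric, a = anything else.
int countMatches(const std::string& rText, char cWanted, bool bIgnoreCase, int* pFirstIndex)
{
    // a: "anything else", here a small lambda that optionally lower-cases a character
    auto aNormalize = [bIgnoreCase](char c) {
        return bIgnoreCase
                   ? static_cast<char>(std::tolower(static_cast<unsigned char>(c)))
                   : c;
    };

    const char cTarget = aNormalize(cWanted);
    int nCount = 0;
    for (std::size_t nPos = 0; nPos < rText.size(); ++nPos)
    {
        if (aNormalize(rText[nPos]) != cTarget)
            continue;
        // Report the position of the first match if the caller asked for it.
        if (nCount == 0 && pFirstIndex != nullptr)
            *pFirstIndex = static_cast<int>(nPos);
        ++nCount;
    }
    return nCount;
}
```

With this sketch, countMatches("Office", 'f', false, &nFirst) returns 2 and sets nFirst to 1. Reading the signature alone already tells you which argument is a reference, which is a flag and which may be null, which is the practical benefit of keeping the prefixes consistent.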
Hi! Nice post; great that you share your knowledge.
Some nitpicks:
1. double is actually the main floating-point type used in the codebase;
2. When your CI job has errored out, it's better to ask others on IRC to resume it than to rebase immediately; that's because platforms that already succeeded are not re-tested when the job is resumed, but are when the change is rebased - thus resuming helps reduce the overall load on the CI infrastructure.
3. The code conventions documentation is mentioned at https://wiki.documentfoundation.org/Development; there are also type prefixes like x (for interfaces), e (for enums); and also the other class of prefixes identifying if it's a member (m_), a static (s_).
Searching for a tip on Jenkins failing, I found your post. Thanks :)