nibra / fix-copyright
A one-shot project created to convert the copyright statements in the Joomla project to a standardised format. It might be useful for other projects as well.
Requires (Dev)
- phpunit/phpunit: ^8.5
README
Why bother
Until 2020, Joomla used this format:
Copyright (C) 2005 - 2020 Open Source Matters. All rights reserved.
The ending year of the range had to be updated in every file each year. Although a script performed the update, it cluttered the file history every year without any benefit.
Thus, the Production Department leadership of Joomla decided in mid-2020 to follow the advice in this excellent article about how and why to properly write copyright statements.
Some people consider this change pointless, but, as Michael Babker nailed it,
This might come across as a pointless change, but the other pointless change is modifying every file in every Joomla owned repository in January to amend the ending date of the copyright claim. Additionally, a copyright claim is being made on every file in most every Joomla owned repository of a copyright dating back to 2005, which is clearly not factual.
How it works
The original approach to determine the creation date for a file was
YEAR=$(git log --follow --date=format:%Y --pretty=format:"%cd" --diff-filter=A --find-renames=40% "${FILE}" | tail -n 1)
However, the results were disappointing. git itself uses content similarity to find renames, which led to unexpected results.
The Git > Show History function of PhpStorm, on the other hand, gave very plausible results for the first commit, and some research revealed the implementation in IntelliJ (which is the base for PhpStorm).
The people at JetBrains found that git log --follow does detect renames, but it has a bug - merge commits aren't handled properly: they just disappear from the history. See http://kerneltrap.org/mailarchive/git/2009/1/30/4861054 and the whole thread about that: --follow is buggy, but maybe it won't be fixed.
The solution, which is re-implemented here, is to:
1. Get the first commit of the file with that name.
2. Get the status (Added, Copied or Renamed) of that commit.
3. Stop if the status is Added or Copied: this really is the first commit.
4. If the status is Renamed, get the first commit of the file with the previous name before the current commit.
5. Continue with step 2.
How to adapt the scripts for your environment
In fix-copyright.sh, change lines 4-7 to suit your settings:
GREP_PATTERN="(Copyright )?\(C\) .* Open Source Matters.*All rights reserved\.?"
SED_PATTERN="\(Copyright \)\?(C) .* Open Source Matters.*All rights reserved\.\?"
OWNER="Open Source Matters, Inc."
CONTACT="https://www.joomla.org"
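Note that grep and sed need different escaping for the same pattern. The grep pattern is written as an extended regular expression, where literal parentheses are escaped while grouping parentheses and ? are not; the sed pattern is a basic regular expression, where it is the other way round.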
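To see the two escaping styles side by side, the following sketch runs both patterns against one sample line. It is an illustration under assumptions: the helper names has_old_statement and rewrite_statement are hypothetical, the way fix-copyright.sh actually invokes grep and sed may differ, and the \? in SED_PATTERN relies on GNU sed.

```shell
#!/usr/bin/env bash
# Sketch only: the same copyright line matched with grep -E and rewritten with sed.

GREP_PATTERN="(Copyright )?\(C\) .* Open Source Matters.*All rights reserved\.?"
SED_PATTERN="\(Copyright \)\?(C) .* Open Source Matters.*All rights reserved\.\?"

# Extended regex: ( ) group, ? is a quantifier, literal parentheses need a backslash.
has_old_statement() {
    grep -E -q "${GREP_PATTERN}"
}

# Basic regex (GNU sed): \( \) group, \? is a quantifier, bare parentheses are literal.
# '|' is used as the delimiter so the URL in the replacement needs no escaping.
rewrite_statement() {
    sed "s|${SED_PATTERN}|$1|"
}

sample=' * Copyright (C) 2005 - 2020 Open Source Matters. All rights reserved.'

if printf '%s\n' "${sample}" | has_old_statement; then
    printf '%s\n' "${sample}" |
        rewrite_statement "(C) 2005 Open Source Matters, Inc. <https://www.joomla.org>"
fi
```

When adapting the patterns to another owner, changing both variables together and checking them against a real header line like this helps catch a mismatch between the two dialects.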
You might want to adjust the default year in lines 18 and 20 (here the default year is 2005):
if [[ ${FILE} == *.xml ]]; then
    REPLACEMENT="(C) ${YEAR:-2005} ${OWNER}"
else
    REPLACEMENT="(C) ${YEAR:-2005} ${OWNER} <${CONTACT}>"
fi
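The default comes from Bash parameter expansion: ${YEAR:-2005} expands to 2005 whenever YEAR is unset or empty, for instance when no commit could be found for a file. The sketch below wraps the same branch in a hypothetical function, build_replacement, purely to show both cases; it is not part of the script.

```shell
#!/usr/bin/env bash
# Sketch only: how the ${year:-2005} fallback and the .xml branch behave.

OWNER="Open Source Matters, Inc."
CONTACT="https://www.joomla.org"

# build_replacement FILE YEAR (hypothetical helper; YEAR may be empty)
build_replacement() {
    local file="$1" year="$2"
    if [[ ${file} == *.xml ]]; then
        # XML files get the statement without the <contact> part.
        echo "(C) ${year:-2005} ${OWNER}"
    else
        echo "(C) ${year:-2005} ${OWNER} <${CONTACT}>"
    fi
}

build_replacement "manifest.xml" "2012"   # (C) 2012 Open Source Matters, Inc.
build_replacement "index.php" ""          # (C) 2005 Open Source Matters, Inc. <https://www.joomla.org>
```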
ToDo
- Move functionality from fix-copyright.sh to fix-copyright.php
- Provide PATTERN, OWNER and CONTACT as command line parameters
- Escape pattern internally