Question

IFS Cloud - FTP to File Share - Change Docman Pointers

  • 12 January 2024
  • 12 replies
  • 221 views

Userlevel 5
Badge +10

All,

We are upgrading to IFS Cloud 23R2 (Remote) from Apps 10, where we have historically used FTP as a repository type. 

Since this is no longer supported, we are moving to File Share and use of a Windows share instead.

We have manually moved the files from FTP to File Share.

I am looking for guidance on what I need to do to change the "file pointer" in Docman as well as add some metadata used by File Storage please for all our existing documents?

Does someone have a pre-existing script they would be willing to provide? 

Thanks in advance

Shaun


12 replies

Userlevel 7
Badge +30

Hi Shaun,

First you should migrate from FTP to Shared repo. This is easy and I will soon post a document here that should help.

Once that is done, you need to use the File Storage Migration Tool, documented here:

https://docs.ifs.com/techdocs/23r2/030_administration/210_cloud_file_storage/fsmigtool/

You should read all of it since it has a lot of useful and important information about the migration and using the tool.

Then you should focus on this section and the relevant sections that cover migration from FTP or Shared repos :

https://docs.ifs.com/techdocs/23r2/030_administration/210_cloud_file_storage/fsmigtool/#ftp_and_shared_repository_support

In theory you could move the files manually to the shared used by File Storage but the problem is that it sets some special file attributes which only the service/projection used by the File Storage migration tool will do for you.

 

Userlevel 5
Badge +10

Hi @Mathias Dahl 

I have read the previously provided link to the tool, and I downloaded this to my machine. 

However we have already copied the files across to the Shared folder so don’t need the tool to do this. The files need to remain in the FTP repository as this is being used for Apps 10 until we go Live in June, hence a copy only. It is purely the updating of the Doc Man tables that needs completing. 

I can’t see anything in this tool which performs this step, and specifically it is just the population of fss_file_tab that needs performing I believe.

Also the documentation seems to contradict it self, as it states FTP repository is supported  but then it goes on to talk about valid documents and these need to be in Database repository type!

We therefore don’t have a source database, and I am not sure how we can complete the installation steps.

Regards

Shaun

Userlevel 7
Badge +30

@SHAUN_KERSLAKE 

I have read the previously provided link to the tool, and I downloaded this to my machine. 

Great!

However we have already copied the files across to the Shared folder so don’t need the tool to do this.

Unfortunately there is no way today to make the tool not copy the files from the source location. 

I can’t see anything in this tool which performs this step, and specifically it is just the population of fss_file_tab that needs performing I believe.

Populating the fss_file_tab is very important, yes, that's how File Storage knows what files it has stored.

The FS mig tool uses a projection named FssMigrationHandling to migrate the files (which you can all yourself if you have the means to do so). Apart from copying the file content and creating records in fss_file_tab, it also sets some attributes in the copied files. That's why the tool needs to be used.

Also the documentation seems to contradict it self, as it states FTP repository is supported  but then it goes on to talk about valid documents and these need to be in Database repository type!

Thanks for reporting that, we will fix that.

We therefore don’t have a source database, and I am not sure how we can complete the installation steps.

I think you do have a source database, running Apps 10. That's where you need to install the necessary PL/SQL package (using the FsMigTool.sql) that the FS mig tool uses. You can ignore the error in the documentation that you mentioned above. 

I'll try to get hold of the information on what file attributes that needs to be set in the files that FS keeps track of. Perhaps you can add them manually.

It might be easier to just let the tool do all the work though, creating records in fss_file_tab plus setting the right attributes. You will also get logging and status tracking of the migration.

 

Userlevel 7
Badge +30

Hi again,

Here is the document I promised. I started writing it today so it might not be complete but, hopefully, it contains information that can be useful in your file migration. Let me know what you think. I will post this separately later and it would be great with feedback on the contents.

Changing document repository

About this document

The purpose of this document is to give you the knowledge needed to "manually" migrate files between document repositories. It's important to first understand how Docman keeps track of the files that is stored in IFS.

Data model

These are the entities involved when document files are stored and retrieved.

Apart from just reading the below, it is a good idea to actually look at the contents of the database tables for the entities. Just by looking at the data in EdmFile and EdmLocation, together with the data in your repostitory, should give you a good idea on how things fit together.

EdmFile (edm_file_tab)

This is what we call the "file reference" of document / document revision. This entity is used to store the original file name as well as the file name in the document repository. Another important attribute is the storage location.

Here are the most important attributes:

  • FileName: This is the file name used in the repository / backend. It's not relevant for Database repositories, but it's important for the other repository types. This is usually generated by the system and it MUST be unique in the folder/place where the files are stored.
  • LocationName: This is the name of the document repository where the file is stored. It points to/references a "location" (repository) in EdmLocation
  • Path: If the file is not stored in the "root" of the repository, then the path is kept here.

EdmFileStorage (edm_file_storage_tab)

Used in Database story only. This is where the file information is stored. One record per file stored.

Here are the most important attributes:

  • FileData: This is a BLOB column (binary large object) that can keep large file content.

EdmLocation (edm_location_tab)

This is where the repositories are defined, which are pointed to/referenced by the EdmFile entity (file reference).

  • LocationName: The repository name. This is the ID/name and its the primary key of the repository.
  • LocationType: The repository type. This is a number and can have one of the following values: 1) Shared, 2) FTP, 3) Database, and 4) File Storage
  • LocationAddress: Relevant for Shared and FTP repositories. The address as defined in EdmLocationUser.
  • Path: if the repository type allows to point to a specific default value for the folder/directory inside the repository, it's set here. Relevant only for Shared and FTP repositories.

EdmLocationUser (edm_location_user_tab)

  • LocationAddress: The IP addres or name of the server. Relevant only for Shared and FTP repositories.
  • LocationUser: The username used to log in to the server.
  • LocationPassword: The password used to log in to the server.

Scenarios

FTP or Shared server name needs to change

This scenario is very simple to handle: just update the repository address in the EdmLocation and EdmLocationUser entities.

Change repository type

This can be easy or hard depending on what the change is.

FTP <-> Shared

Changing between FTP and Shared repositories is quite simple:

  • Move the files (if needed, they might already be stored in the correct physical location)
  • Change the repository type in EdmLocation (this needs to be done from an SQL tool)
  • Change the Repository Address to have a value that matches the repository type
  • Make sure the Path in EdmFile, of not empty, exists in the new repo and that the file is stored there.

FTP/Shared -> Database

This is more complex because the files cannot just be moved using normal file copy tools.

  • Change repository type in the EdmLocation entity.
  • Read file content from the FTP or Shared folder and write file to EdmFileStorage.FileData. The record should be removed from EdmFileStorage once the data has been read from it and written to a file in the new repository.
  • Add Path information in EdmFile if a file has been moved to a FTP or Shared repository and if the file is not stored in the root (top) folder.

FTP/Shared/Database -> File Storage

This is the most complex option because File Storage (the IFS Cloud File Storage service/capability of the platform) has its own file reference registry. For each file that Docman keeps, and apart from the EdmFile record in Docman, a record is created in the file reference of File Storage.

In order to understand how to move a file from FTP/Shared/Database in Docman, we need to understand how File Storage keeps information about the files it stores.

FssFile (fss_file_tab))

This is where the file references are stored in File Storage.

These are the important attributes:

  • FileId: This is the unique ID of the file which is used as the ID/key in the back end to File Storage. In the case of Docman, we use the FileName from EdmFile, since that is unique.
  • LuName: The owner of the data. Meant to be an LU name, but in the case of Docman, the value is "docman".
  • FileName: This is the original, or "nice" filename. It's secondary and just extra metadata in the backend storage. In the case of Docman, the original file name is used (EdmFile.UserFileName)
  • FileExtension: The file extension, in lowercase.
  • FileType: Seems to be hard coded to "application/x-www-form-urlencoded".
  • FileLength: This MUST be set to the length of the file content, in bytes.
  • RowState: Will have the value "Active" once the file has been virus scanned. The vlaue "QueuedForScanning" means the file is in the queue to be scanned. If any other value, something went wrong. If you want to disable virus scanning during migration, set this value to "ScanDisabled".

Use the Fss_File_API.Create_File_Ref_Rec method to create records in this entity.

If records are created here by INSERT statements, remember to populate also the RowKey and RowVersion attributes with a GUID and SYSDATE, respectively.

Steps

These would be the theoretical steps to move a file to File Storage:

  • Create the record in the FssFile entity. FileId is very important and is the FileName attribute from EdmFile (see the earlier section for information on the other attributes in FssFile).
  • Change the repository type in EdmLocation OR update LocationName on all relevant records in EdmFile to the new repository that uses File Storage.
  • Copy the file to the docman folder in the share controlled by File Storage. Do not create any extra sub folder since this is currently not supported in File Storage.
Special file attributes

It has been reported that File Storage sets some special file attributes in Remote. This does not happen from the File Storage service itself so it might be Kubernetes that does this. We currently don't know if this is important or not for the operation.

Thanks!

Userlevel 5
Badge +10

@Mathias Dahl Thanks for providing this information.

 

Are you able to provide further guidance around how we populate these two fields in fss_file_tab:

  • FileExtension: The file extension, in lowercase. Where is file extension currently stored in IFS as don’t see this in edm_file table? Unless expectation is some SQL to extract the extension from the file_name field. 
  • FileLength: This MUST be set to the length of the file content, in bytes. How or where do we obtain the length of file content in order to populate?

Thanks

Userlevel 7
Badge +30

@SHAUN_KERSLAKE 

To get the file extension, how to do it depends on what tools you have available how you extract the file extension, and your personal preferences. I would probably just use SUBSTR and INSTR to extract it from the FileName attribute.

Here is how to do it:

SUBSTR(file_name_, INSTR(file_name_, '.', -1) + 1)

Another way, which requires more work for Oracle is to join EdmFile with EdmApplication (which keeps the file type - file extension mapping) and get it from there.

As for the file length, you have to get that by your scripting tool of choice. I asked ChatGPT and it came up with this cmd script:


@echo off

setlocal enabledelayedexpansion

set "inputFile=filelist.txt"

set "outputFile=filesizes.txt"

if not exist "!inputFile!" (

    echo Input file not found.

    exit /b 1

)

rem Delete output file if it already exists

if exist "!outputFile!" del "!outputFile!"

for /f "tokens=*" %%a in (!inputFile!) do (

    set "filename=%%a"

    

    if exist "!filename!" (

        set "size=0"

        for /f %%b in ('powershell -command "(Get-Item '!filename!').length"') do (

            set "size=%%b"

        )

        echo !filename!;!size! >> "!outputFile!"

    ) else (

        echo !filename!;File not found >> "!outputFile!"

    )

)

echo Operation complete. Output written to "!outputFile!"

endlocal

Here is a link to the chat:

https://chat.openai.com/share/3b583521-4596-431e-b01a-7736654816e8

As you can see, the first attempt didn't work so I had to ask it to find another way, and that seems to work.
 

Userlevel 5
Badge +10

@Mathias Dahl  Thanks the file extension part works! 

Just to clarify for the file length, I need to export all the file names from IFS into a text file, and run this script against our Windows Share directory or FTP repository where the files are held to obtain the length?

Then I assume we need to come up with a way of getting this information back into IFS by possibly loading it into a temp Oracle table to hold the data, and then read from this before inserting into fss_file_tab.

Userlevel 7
Badge +30

@SHAUN_KERSLAKE 

Just to clarify for the file length, I need to export all the file names from IFS into a text file, and run this script against our Windows Share directory or FTP repository where the files are held to obtain the length?

Yes. But I'll have someone comment here on how important the file length column is in this context such that you don't do that work in vain…

@chanaka-shanil, can you comment on how the File Length attribute in FssFile is used and how critical it is? 

Then I assume we need to come up with a way of getting this information back into IFS by possibly loading it into a temp Oracle table to hold the data, and then read from this before inserting into fss_file_tab.

If you fetch the size for each file into a text file, it should be a small thing to convert that text file into a number of INSERT statements or, better, calls to Fss_File_API (as per my "documentation" above)

That's a perfect task to give ChatGPT as well. Here is one possible conversation:

https://chat.openai.com/share/c93e9999-b4ed-40c6-ba28-0b526afff7d4

Tweak as necessary.

If you have access to a powerful text editor with macro capabilities, like Emacs or Notepad++, transforming the file names and file lengths into an SQL script is also not hard. Or you can use free command line tools like AWK to do it. Or PowerShell, or Python or ... 🙂 

 

Userlevel 5
Badge +10

@Mathias Dahl our IT have been able to get a csv file created listing the file name and the length separated by a comma. I have then used IFS data migration tool to upload this file into a table, and I can then use this table to link to edm_file in a pl/sql cursor as part of the upload into fss_file_tab. 

We have a number of records in EDM_FILE that have no link to a file name and length, this may be due to the file not actually existing on FTP/File Share. If I default the length to 0 in this scenario I assume this is fine?  Appreciate waiting @chanaka-shanil to comment on this. 

Also in EDM_FILE we have files with a doc type of ORIGINAL and VIEW. Are we only interested in uploading those with ORIGINAL into fss_file_tab?

Please confirm.

Userlevel 7
Badge +30

@SHAUN_KERSLAKE 

We have a number of records in EDM_FILE that have no link to a file name and length, this may be due to the file not actually existing on FTP/File Share. If I default the length to 0 in this scenario I assume this is fine?  Appreciate waiting @chanaka-shanil to comment on this.

A record in EdmFile that has no value in FileName I would deem as "empty". Actually, the FileName attribute is mandatory so I wonder how that record was ever created. Can you confirm that it's the FileName attribute (file_name column) you are talking about, and not UserFileName (original file name when the user uploaded the file to Docman)?

Also in EDM_FILE we have files with a doc type of ORIGINAL and VIEW. Are we only interested in uploading those with ORIGINAL into fss_file_tab?

It depends, and this is yet again a scenario that the FS mig tool will try to handle for you. I cannot describe here all the scenarios and how to handle each one but I can say this:

  1. If you check in a PDF file to an empty document revision, and PDF files are set as Document Type = VIEW, in our basic data, then the system will create TWO records pointing to the same file, one with Document Type = VIEW, and one with ORIGINAL.
  2. If you check in, say, a Word file and the user also has a PDF file with the same base name on disk, the system will upload both files.

In scenario 1, there is really only one file, not two as the records in EdmFile says.

In scenario 2 there are two different files, and you probably want to keep both (at the very least the original).

Again, the tool would handle this case for you and do the right thing in both scenarios. Since you do this work "manually" you need to handle this scenario on your own.

Is it clear?
 

Userlevel 5
Badge +10

Morning @Mathias Dahl 

So to clarify, from IFS point of view the document exists as it is shown in EDM_FILE_TAB with all the fields populated. However there are two scenarios:

1 - when we search for the corresponding file(s) in the repository (FTP repository/File Share), it does not exist. 

2 - File exists in the repository (FTP repository/File Share) but it’s file content is empty; no file length.

IFS has no knowledge whether the file actually exists in the repository or if the file has content when the repository is FTP, and it is only when the user clicks ‘View document’ do you get the error returned.

Through IT providing me with a list of all the file names and their length in a csv from the repository, and using this to cross reference against the files in edm_file_tab I can see those with no file length and hence suggesting I default to 0. 

I am now inclined to ignore these all together in the upload to the Cloud file share table (fss_file_tab) to be honest.

Regards

Shaun

Userlevel 7
Badge +30

@SHAUN_KERSLAKE 

> So to clarify, from IFS point of view the document exists as it is shown in EDM_FILE_TAB with all the fields populated.

You could say that, yes. We don't have complete control of the storage location. Files can be removed or changed without IFS knowing it, and it can happen even for files that are kept in the database. So the best thing we have is what's in EdmFile. In order to be more sure, IFS would need to periodically scan the storage locations, making sure all files are there, that we think should be there. But we don't do anything like it.

1 - when we search for the corresponding file(s) in the repository (FTP repository/File Share), it does not exist.

There can be several reasons for this:

  • It should not happen often, but if something goes wrong during check-in/upload, there are scenarios where we have failed to rollback the data in EdmFile. This is something we are working on and polishing over time, but there are cases over which we have no control. So, the content in EdmFile must not represent the truth, even if IFS is the only "actor".
  • Someone with access to the storage location might have removed or changed the file.
  • Hardware or system failures

2 - File exists in the repository (FTP repository/File Share) but it’s file content is empty; no file length.

Although it's not very common and also not very valuable, an empty file is allowed, both in all major file systems, and in Docman. What I would do, for these documents, would be to remove the files and also remove the record in EdmFile. There is very little value in keeping the files apart from that it tells us it's file type.

IFS has no knowledge whether the file actually exists in the repository or if the file has content when the repository is FTP, and it is only when the user clicks ‘View document’ do you get the error returned.

Correct. Here, IFS Docman is more of a registry than a file system. An analogy is a book listing in a library. The book might be there in the listing but someone stole it or removed it from the shelve without telling.

Through IT providing me with a list of all the file names and their length in a csv from the repository, and using this to cross reference against the files in edm_file_tab I can see those with no file length and hence suggesting I default to 0.

No file length = a file length of 0 (zero), yes.

I am now inclined to ignore these all together in the upload to the Cloud file share table (fss_file_tab) to be honest.

As suggested from the above, I'm inclined to agree 🙂 
 

Reply