I am trying to call Apache Tika through its REST API.
I have been able to upload a PDF document and get the document's text back using curl:
curl -X PUT --data-binary @<filename>.pdf http://localhost:9998/tika --header "Content-type: application/pdf"
I translated that to Indy like so:
function GetPDFText(const FileName: String): String;
var
  IdHTTP: TIdHTTP;
  Params: TIdMultiPartFormDataStream;
begin
  IdHTTP := TIdHTTP.Create;
  try
    Params := TIdMultiPartFormDataStream.Create;
    try
      // Attach the file as a multipart/form-data part named 'file'
      Params.AddFile('file', FileName, 'application/pdf');
      Result := IdHTTP.Put('http://localhost:9998/tika', Params);
    finally
      Params.Free;
    end;
  finally
    IdHTTP.Free;
  end;
end;
Now I want to upload a Word document (.docx). I assumed that all I would need to do is change the content type when I add the file to Params, but that doesn't seem to produce any results, and no error is reported back. I was able to get the following curl command to work correctly:
curl -T <myDOCXfile>.docx http://localhost:9998/tika --header "Content-type: application/vnd.openxmlformats-officedocument.wordprocessingml.document"
Any ideas on how to modify the INDY code?
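One difference that may matter: both curl commands (--data-binary and -T) send the file's bytes as the raw request body, while TIdMultiPartFormDataStream wraps them in a multipart/form-data envelope. Below is a minimal, untested sketch of a raw-body PUT in Indy, for comparison. It assumes Indy 10 and a Delphi version with namespaced units (XE2 or later); the name GetDocumentText and its ContentType parameter are hypothetical, introduced only for illustration.

```delphi
program TikaPutSketch;

{$APPTYPE CONSOLE}

uses
  System.Classes, System.SysUtils, IdHTTP;

// Hypothetical helper: PUTs the file as the raw request body,
// mirroring what curl -T / --data-binary do (no multipart wrapper).
function GetDocumentText(const FileName, ContentType: String): String;
var
  IdHTTP: TIdHTTP;
  Body: TFileStream;
begin
  IdHTTP := TIdHTTP.Create(nil);
  try
    Body := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
    try
      // Equivalent of curl's --header "Content-type: ..."
      IdHTTP.Request.ContentType := ContentType;
      Result := IdHTTP.Put('http://localhost:9998/tika', Body);
    finally
      Body.Free;
    end;
  finally
    IdHTTP.Free;
  end;
end;

begin
  // Usage: TikaPutSketch <path-to-docx>
  Writeln(GetDocumentText(ParamStr(1),
    'application/vnd.openxmlformats-officedocument.wordprocessingml.document'));
end.
```

Passing 'application/pdf' as the second argument would cover the PDF case with the same function, assuming the server treats a raw body the same way for both types.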